From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 09:17:14 2010
Message-ID: <4CD7C013.8010502@freebsd.org>
Date: Mon, 08 Nov 2010 11:17:07 +0200
From: Andriy Gapon
To: jhell
References: <4C91F031.1010801@freebsd.org> <20101010134717.GA59922@deviant.kiev.zoral.com.ua> <4CD2E5AA.1090208@freebsd.org> <4CD48259.6030402@DataIX.net>
In-Reply-To: <4CD48259.6030402@DataIX.net>
Cc: Alan Cox, Pawel Jakub Dawidek, freebsd-fs@freebsd.org
Subject: Re: vop_getpages for zfs

on 06/11/2010 00:16 jhell said the following:
> Thought I would give you a heads up on this after seeing the post about
> zfs_getpages.diff.
>
> I patched up to this before after seeing it posted to @
> and got reliable dumpage from it. Basically a vm_page_unwire fault or
> something like that. I believe the following is the backtrace from that.
>
> (kgdb) #0 doadump () at pcpu.h:231
> #1 0x80675251 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:419
> #2 0x806754e5 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:592
> #3 0x808e27ce in vm_page_unwire (m=0x816b46e0, activate=1)
>     at /usr/src/sys/vm/vm_page.c:1564
> #4 0x808d123a in vm_fault_unwire (map=0x81690088, start=2568372224,
>     end=2568433664, fictitious=0) at /usr/src/sys/vm/vm_fault.c:1123
> #5 0x808d8a33 in vm_map_delete (map=0x81690088, start=2568372224,
>     end=2568433664) at /usr/src/sys/vm/vm_map.c:2619
> #6 0x808d8d0d in vm_map_remove (map=0x81690088, start=2568372224,
>     end=Variable "end" is not available.
> )
>     at /usr/src/sys/vm/vm_map.c:2801
> #7 0x808d6360 in kmem_free (map=0x81690088, addr=2568372224, size=61440)
>     at /usr/src/sys/vm/vm_kern.c:211
> #8 0x808cb601 in page_free (mem=0x99164000, size=61440, flags=34 '"')
>     at /usr/src/sys/vm/uma_core.c:1069
> #9 0x808ccf92 in uma_large_free (slab=0x85be7e6c)
>     at /usr/src/sys/vm/uma_core.c:3021
> #10 0x8065f7a5 in free (addr=0x99164000, mtp=0x80d8e114)
>     at /usr/src/sys/kern/kern_malloc.c:506
> #11 0x80cd49db in zil_lwb_write_done () from /boot/kernel/zfs.ko
> #12 0x80cd99b1 in zio_done () from /boot/kernel/zfs.ko
> #13 0x80cd7d3a in zio_execute () from /boot/kernel/zfs.ko
> #14 0x80cd7f2e in zio_notify_parent () from /boot/kernel/zfs.ko
> #15 0x80cd9a31 in zio_done () from /boot/kernel/zfs.ko
> #16 0x80cd7d3a in zio_execute () from /boot/kernel/zfs.ko
> #17 0x80c6656b in taskq_run_safe () from /boot/kernel/zfs.ko
> #18 0x806af812 in taskqueue_run (queue=0x85a0ac40)
>     at /usr/src/sys/kern/subr_taskqueue.c:239
> #19 0x806afa07 in taskqueue_thread_loop (arg=0x85a3f830)
>     at /usr/src/sys/kern/subr_taskqueue.c:360
> #20 0x80647377 in fork_exit (callout=0x806af94a ,
>     arg=0x85a3f830, frame=0xb4439d38) at /usr/src/sys/kern/kern_fork.c:845
> #21 0x809126a4 in fork_trampoline () at
>     /usr/src/sys/i386/i386/exception.s:273
> (kgdb)
>
> And that coincided with the dates that I added the patch once seeing it
> on the list.
> changeset: 351:f1ca4eb51520
> user: J. Hellenthal
> date: Sun Oct 10 22:57:24 2010 -0400
> summary: Remove the zfs_getpages patch from Andriy Gapon
>
> changeset: 350:bb885c047f0a
> user: J. Hellenthal
> date: Sun Oct 10 22:27:31 2010 -0400
> summary: zfs_getpages improvement from Andriy Gapon
>
> If you would like I can patch back up to this patch to provide more
> information if it's needed, but at the moment I do not have it available
> nor do I have the core that was generated.

Thanks a lot for this report.
Actually the original patch/commit was intended for head only as there are some
differences in page locking between head and other branches. I had almost
forgotten about that, and certainly would have if not for your report.
See r214936 and r214941.
Thanks!
-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 09:55:11 2010
Message-ID: <4CD7C8FC.900@icyb.net.ua>
Date: Mon, 08 Nov 2010 11:55:08 +0200
From: Andriy Gapon
To: freebsd-current@FreeBSD.ORG, freebsd-fs@FreeBSD.ORG
Subject: another fuse panic

JFYI.

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x0
fault code = supervisor write data, page not present
instruction pointer = 0x20:0xffffffff80372a64
stack pointer = 0x28:0xffffff81265486f0
frame pointer = 0x28:0xffffff8126548700
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 4080 (initial thread)
trap number = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff801b9b8a = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0xffffffff803b36ba = kdb_backtrace+0x3a
panic() at 0xffffffff8037c8b2 = panic+0x1d2
trap_fatal() at 0xffffffff8055b35d = trap_fatal+0x39d
trap_pfault() at 0xffffffff8055b638 = trap_pfault+0x2b8
trap() at 0xffffffff8055bd33 = trap+0x603
calltrap() at 0xffffffff80545f78 = calltrap+0x8
--- trap 0xc, rip = 0xffffffff80372a64, rsp = 0xffffff81265486f0, rbp = 0xffffff8126548700 ---
crhold() at 0xffffffff80372a64 = crhold+0x4
fdata_alloc() at 0xffffffff80e17a9f = fdata_alloc+0xcf
fusedev_open() at 0xffffffff80e1896e = fusedev_open+0xae
devfs_open() at 0xffffffff802e8fa7 = devfs_open+0x117
VOP_OPEN_APV() at 0xffffffff805bb0c4 = VOP_OPEN_APV+0x74
vn_open_cred() at 0xffffffff804222bd = vn_open_cred+0x4ad
vn_open() at 0xffffffff804223dc = vn_open+0x1c
kern_openat() at 0xffffffff80420bad = kern_openat+0x15d
kern_open() at 0xffffffff80420f29 = kern_open+0x19
open() at 0xffffffff80420f48 = open+0x18
syscallenter() at 0xffffffff803c0f9e = syscallenter+0x3be
syscall() at 0xffffffff8055b6b1 = syscall+0x41
Xfast_syscall() at 0xffffffff80546252 = Xfast_syscall+0xe2

NULL pointer is passed as an argument to crhold.
-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 11:06:57 2010
Date: Mon, 8 Nov 2010 11:06:56 GMT
Message-Id: <201011081106.oA8B6u6O088081@freefall.freebsd.org>
From: FreeBSD bugmaster
To: freebsd-fs@FreeBSD.org
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org

Note: to view an individual PR, use:
http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker Resp. Description
--------------------------------------------------------------------------------
o kern/151942 fs [zfs] panic during ls(1) zfs snapshot directory
o kern/151910 fs [zfs] booting from raidz/raidz2 on ciss(4) doesn't wor
o kern/151905 fs [zfs] page fault under load in /sbin/zfs
o kern/151845 fs [smbfs] [patch] smbfs should be upgraded to support Un
o kern/151648 fs [zfs] disk wait bug
o kern/151629 fs [fs] [patch] Skip empty directory entries during name
o kern/151330 fs [zfs] will unshare all zfs filesystem after execute a
o kern/151326 fs [nfs] nfs exports fail if netgroups contain duplicate
o kern/151251 fs [ufs] Can not create files on filesystem with heavy us
o kern/151226 fs [zfs] can't delete zfs snapshot
o kern/151111 fs [zfs] vnodes leakage during zfs unmount
o kern/151082 fs [zfs] [patch] sappend-flaged files on ZFS not working
o kern/150796 fs [panic] [suj] [ufs] [softupdates] Panic on portbuild
o kern/150503 fs [zfs] ZFS disks are UNAVAIL and corrupted after reboot
o kern/150501 fs [zfs] ZFS vdev failure vdev.bad_label on amd64
o kern/150390 fs [zfs] zfs deadlock when arcmsr reports drive faulted
o kern/150336 fs [nfs] mountd/nfsd became confused; refused to reload n
o kern/150207 fs zpool(1): zpool import -d /dev tries to open weird dev
o kern/149855 fs [gvinum] growfs causes fsck to report errors in Filesy
o kern/149495 fs [zfs] chflags sappend on zfs not working right
o kern/149208 fs mksnap_ffs(8) hang/deadlock
o kern/149173 fs [patch] [zfs] make OpenSolaris installa
o kern/149022 fs [hang] File system operations hangs with suspfs state
o kern/149015 fs [zfs] [patch] misc fixes for ZFS code to build on Glib
o kern/149014 fs [zfs] [patch] declarations in ZFS libraries/utilities
o kern/149013 fs [zfs] [patch] make ZFS makefiles use the libraries fro
o kern/148504 fs [zfs] ZFS' zpool does not allow replacing drives to be
o kern/148490 fs [zfs]: zpool attach - resilver bidirectionally, and re
o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE
o bin/148296 fs [zfs] [loader] [patch] Very slow probe in /usr/src/sys
o kern/148204 fs [nfs] UDP NFS causes overload
o kern/148138 fs [zfs] zfs raidz pool commands freeze
o kern/147903 fs [zfs] [panic] Kernel panics on faulty zfs device
o kern/147881 fs [zfs] [patch] ZFS "sharenfs" doesn't allow different "
o kern/147790 fs [zfs] zfs set acl(mode|inherit) fails on existing zfs
o kern/147560 fs [zfs] [boot] Booting 8.1-PRERELEASE raidz system take
o kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt
o kern/146941 fs [zfs] [panic] Kernel Double Fault - Happens constantly
o kern/146786 fs [zfs] zpool import hangs with checksum errors
o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl
o kern/146528 fs [zfs] Severe memory leak in ZFS on i386
o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server
o kern/146375 fs [nfs] [patch] Typos in macro variables names in sys/fs
s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat
o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an
o bin/145309 fs bsdlabel: Editing disk label invalidates the whole dev
o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on
o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it
o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank
o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0
o kern/145189 fs [nfs] nfsd performs abysmally under load
o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c
o kern/144458 fs [nfs] [patch] nfsd fails as a kld
p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi
o kern/144416 fs [panic] Kernel panic on online filesystem optimization
s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash
o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code
o kern/143825 fs [nfs] [panic] Kernel panic on NFS client
o bin/143572 fs [zfs] zpool(1): [patch] The verbose output from iostat
o kern/143345 fs [ext2fs] [patch] extfs minor header cleanups to better
o kern/143212 fs [nfs] NFSv4 client strange work ...
o kern/143184 fs [zfs] [lor] zfs/bufwait LOR
o kern/142924 fs [ext2fs] [patch] Small cleanup for the inode struct in
o kern/142914 fs [zfs] ZFS performance degradation over time
o kern/142878 fs [zfs] [vfs] lock order reversal
o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real
o kern/142489 fs [zfs] [lor] allproc/zfs LOR
o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re
o kern/142401 fs [ntfs] [patch] Minor updates to NTFS from NetBSD
o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two
o kern/142068 fs [ufs] BSD labels are got deleted spontaneously
o kern/141897 fs [msdosfs] [panic] Kernel panic. msdofs: file name leng
o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro
o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues (
o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled
o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS
o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2
o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri
o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS-
o kern/140640 fs [zfs] snapshot crash
o kern/140134 fs [msdosfs] write and fsck destroy filesystem integrity
o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs
o bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n
o kern/139597 fs [patch] [tmpfs] tmpfs initializes va_gen but doesn't u
o kern/139564 fs [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo
o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot
o kern/138790 fs [zfs] ZFS ceases caching when mem demand is high
o kern/138662 fs [panic] ffs_blkfree: freeing free block
o kern/138421 fs [ufs] [patch] remove UFS label limitations
o kern/138202 fs mount_msdosfs(1) see only 2Gb
o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873 fs [ntfs] Missing directories/files on NTFS volume
o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic
o kern/136470 fs [nfs] Cannot mount / in read-only, over NFS
o kern/135667 fs [lor] LORs causing ufs filesystem corruption on XEN Do
o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot
o kern/134491 fs [zfs] Hot spares are rather cold...
o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133614 fs [panic] panic: ffs_truncate: read-only filesystem
o kern/133174 fs [msdosfs] [patch] msdosfs must support utf-encoded int
o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132397 fs reboot causes filesystem corruption (failure to sync b
o kern/132331 fs [ufs] [lor] LOR ufs and syncer
o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145 fs [panic] File System Hard Crashes
o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo
o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin
o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130210 fs [nullfs] Error by check nullfs
o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c:
o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8)
f kern/128829 fs smbd(8) causes periodic panic on 7-RELEASE
o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs
o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero
o kern/127029 fs [panic] mount(8): trying to mount a write protected zi
o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file
o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free
s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS
p kern/124621 fs [ext3] [patch] Cannot mount ext2fs partition
f bin/124424 fs [zfs] zfs(8): zfs list -r shows strange snapshots' siz
o kern/123939 fs [msdosfs] corrupts new files
o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121779 fs [ufs] snapinfo(8) (and related tools?) only work for t
o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8
o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha
f kern/120991 fs [panic] [ffs] [snapshot] System crashes when manipulat
o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes
o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F
f kern/119735 fs [zfs] geli + ZFS + samba starting on boot panics 7.0-B
o kern/118912 fs [2tb] disk sizing/geometry problem with large array
o kern/118713 fs [minidump] [patch] Display media size required for a k
o bin/118249 fs mv(1): moving a directory changes its mtime
o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N
o kern/117954 fs [ufs] dirhash on very large directories blocks the mac
o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount
o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani
o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with
p kern/116608 fs [msdosfs] [patch] msdosfs fails to check mount options
o kern/116583 fs [ffs] [hang] System freezes for short time when using
o kern/116170 fs [panic] Kernel panic when mounting /tmp
f kern/115645 fs [ffs] [snapshots] [panic] lockmgr: thread 0xc4c00d80,
o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un
o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468 fs [patch] [request] add -d option to umount(8) to detach
o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral
o bin/113838 fs [patch] [request] mount(8): add support for relative p
o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show
o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b
o kern/111843 fs [msdosfs] Long Names of files are incorrectly created
o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems
s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem
o kern/109024 fs [msdosfs] [iconv] mount_msdosfs: msdosfs_iconv: Operat
o kern/109010 fs [msdosfs] can't mv directory within fat32 file system
o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro
o kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk
o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist
o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear
o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes
s bin/97498 fs [request] newfs(8) has no option to clear the first 12
o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222 fs [iso9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849 fs [ufs] rename on UFS filesystem is not atomic
o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean'
o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733 fs [smbfs] smbfs may cause double unlock
o bin/94635 fs snapinfo(8)/libufs only works for disk-backed filesyst
o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna
o kern/91134 fs [smbfs] [patch] Preserve access and modification time
a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet
o kern/88657 fs [smbfs] windows client hang when browsing a samba shar
o kern/88555 fs [panic] ffs_blkfree: freeing free frag on AMD 64
o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi
o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl
o kern/87859 fs [smbfs] System reboot while umount smbfs.
o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files
o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc.
f kern/85326 fs [smbfs] [panic] saving a file via samba to an overquot
o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi
o bin/74779 fs Background-fsck checks one filesystem twice and omits
o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem
o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun
o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange
o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503 fs [smbfs] mount_smbfs does not work as non-root
o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc
o kern/51583 fs [nullfs] [patch] allow to work with devices and socket
o kern/36566 fs [smbfs] System reboot with dead smb mount and umount
o kern/33464 fs [ufs] soft update inconsistencies after system crash
o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc
o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t

214 problems total.
From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 11:55:05 2010
Message-ID: <4CD7E515.5040209@icyb.net.ua>
Date: Mon, 08 Nov 2010 13:55:01 +0200
From: Andriy Gapon
To: Ivan Voras
References: <4CD7C8FC.900@icyb.net.ua>
Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org
Subject: Re: another fuse panic

on 08/11/2010 13:35 Ivan Voras said the following:
> On 11/08/10 10:55, Andriy Gapon wrote:
>>
>> JFYI.
>> Fatal trap 12: page fault while in kernel mode
>
> Can you find any set of circumstances which make this repeatable?
>
> This panic apparently goes like this:
>
> 1) used by devfs_open():
> 47 static struct cdevsw fuse_cdevsw = {
> 48 .d_open = fusedev_open,
>
> 2) in fusedev_open():
> 119 fdata = fdata_alloc(dev, td->td_ucred);
>
> 3) in fdata_alloc():
> 297 data->daemoncred = crhold(cred);
>
> in other words, td->td_ucred from td passed to fusedev_open (presumably
> when the device is opened from the userland) appears to be NULL.
>
> I don't know if there is any normal set of circumstances under which
> this is expected.

I reliably got this panic when all I was doing was saving an attachment in
thunderbird 3 that ran in KDE 4 environment. Not sure what was going on behind
the scenes, but shouldn't have been anything out of the ordinary.
-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 12:13:23 2010
Message-ID: <4CD7E960.1070200@freebsd.org>
Date: Mon, 08 Nov 2010 14:13:20 +0200
From: Andriy Gapon
To: Ivan Voras
References: <4CD7C8FC.900@icyb.net.ua> <4CD7E515.5040209@icyb.net.ua>
In-Reply-To: <4CD7E515.5040209@icyb.net.ua>
Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org
Subject: Re: another fuse panic

on 08/11/2010 13:55 Andriy Gapon said the following:
> I reliably got this panic when all I was doing was saving an attachment in
> thunderbird 3 that ran in KDE 4 environment. Not sure what was going on behind
> the scenes, but shouldn't have been anything out of the ordinary.

Perhaps this is my local mistake. I can't see from code and crash dump how NULL
pointer is possible there. So perhaps I have some ABI mismatch between kernel
and fuse module.
I will rebuild fuse kmod and re-test again.
-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 12:28:42 2010
To: freebsd-fs@freebsd.org
From: Ivan Voras
Date: Mon, 08 Nov 2010 13:28:26 +0100
Subject: The state of Giant lock in the file systems?

I was looking at fusefs sources and there is a dance it does with the
Giant lock which looks fishy.

Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of
mentions, but if I understand it correctly only these are "active" instances:

1) one set of mtx_assert() calls on it in pseudofs, which I can't figure
out what they're guarding
2) some manual locking and unlocking in nfsclient which appears to only
guard printf() (???)
3) some more locking in nfsserver which apparently is only there to
guard the underlying local file system
4) coda, which appears to be the only one marked with D_NEEDGIANT, but
doesn't do much of its own interfacing with it

Except for these, is there any more magic that would need to be resolved
to excise Giant from VFS? Would it be correct to think that coda is the
single biggest obstacle?
From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 12:41:04 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 95CE110656C5 for ; Mon, 8 Nov 2010 12:41:04 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id 4C5818FC25 for ; Mon, 8 Nov 2010 12:41:03 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1PFR1y-0001F5-Sd for freebsd-fs@freebsd.org; Mon, 08 Nov 2010 13:41:02 +0100 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 08 Nov 2010 13:41:02 +0100 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 08 Nov 2010 13:41:02 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Date: Mon, 08 Nov 2010 13:40:51 +0100 Lines: 14 Message-ID: References: <4CD7C8FC.900@icyb.net.ua> <4CD7E515.5040209@icyb.net.ua> <4CD7E960.1070200@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6 In-Reply-To: <4CD7E960.1070200@freebsd.org> X-Enigmail-Version: 1.1.2 Cc: freebsd-current@freebsd.org Subject: Re: another fuse panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 12:41:04 -0000 On 11/08/10 13:13, Andriy Gapon wrote: > on 08/11/2010 13:55 Andriy Gapon said the following: >> I reliable got this panic when all I was doing is saving an attachment in >> thunderbird 3 that ran in 
KDE 4 environment. Not sure what was going on behind >> the scenes, but shouldn't have been anything out of the ordinary. > > Perhaps this is my local mistake. I can't see from code and crash dump how NULL > pointer is possible there. So perhaps I have some ABI mismatch between kernel > and fuse module. > I will rebuild fuse kmod and re-test again. OTOH it could be one of those supposed memory corruptions of FUSE, but this particular backtrace is pretty tight, I can't see anything that can NULL-ify something in the struct thread. From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 14:32:35 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E40BF106566B; Mon, 8 Nov 2010 14:32:35 +0000 (UTC) (envelope-from gleb.kurtsou@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 4423B8FC19; Mon, 8 Nov 2010 14:32:34 +0000 (UTC) Received: by bwz3 with SMTP id 3so4924375bwz.13 for ; Mon, 08 Nov 2010 06:32:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:cc:subject :message-id:references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=7VW5+8jQ+fRzpA3ohr1dYewahGI6mBkL1elER2pkx0c=; b=eCzFYxwhNTr6AaFtTgYTwRSF33lWCsyabha8NyJaJzz07FWCFshXMhgK/WLHLwefFW iCrNAjTpJHGDQrTeWbANn3KiTtd6xkUFdfkzHOYSbf8hqo/nQX7nSphkT9Fu8NmWwEnm nJHNeHOUg88HNZVxrbBmLYdj+1/99lXIfvrt4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=F/Xcusv9TFJ3BJHYORP5j8QDMI8dWIwJYlWh5l3PDGKcPuOzoE9tVkbfQly/gOQtE/ 82Rqnn1SsXzp2kypTFqYQw5vzd7OPw2prag7xjfqNM9gVV2F6k17WkUMq0G2Zb3RwwSK x3bL5a5fx8+FqJFzW9bqDGu/1NGDAtSvuZ1jc= Received: by 10.204.62.193 with SMTP id 
y1mr4957723bkh.131.1289226695087; Mon, 08 Nov 2010 06:31:35 -0800 (PST) Received: from localhost ([91.187.5.20]) by mx.google.com with ESMTPS id r21sm3911608bkj.22.2010.11.08.06.31.33 (version=TLSv1/SSLv3 cipher=RC4-MD5); Mon, 08 Nov 2010 06:31:33 -0800 (PST) Date: Mon, 8 Nov 2010 16:31:30 +0200 From: Gleb Kurtsou To: Ivan Voras Message-ID: <20101108143130.GA2799@tops> References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: The state of Giant lock in the file systems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 14:32:36 -0000 On (08/11/2010 13:28), Ivan Voras wrote: > I was looking at fusefs sources and there is a dance it does with the > Giant lock which looks fishy. It's intended to be fishy. No kernel-level locks should be held when returning to userland. In other words, on each syscall the vnode is locked (plus the Giant lock for the fs if needed), then it's unlocked by the filesystem and relocked upon callback from userspace. puffs is MPSAFE, if that could be of any help to you. > Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of > mentionings, but if I understand it correctly only these "active" instances: > > 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure > out what they're guarding > 2) some manual locking and unlocking in nfsclient which appears to only > guard printf() (???) Somewhat unrelated, but does the NFS client unlock vnodes while sending/waiting for the RPC reply? I thought it does, but I'm not sure.
> 3) some more locking in nfsserver which apparently is only there to > guard the underlying local file system > 4) coda, which appears to be the only one marked with D_NEEDGIANT, but > doesn't do much of its own interfacing with it > > Except for these, is there any more magic that would need to be resolved > to excise Giant from VFS? Kostik was working on it. > Would it be correct to think that coda is the single biggest obstacle? A filesystem should be marked as MPSAFE; the relevant flag is not D_NEEDGIANT but MNTK_MPSAFE. A lot of filesystems are still locked by Giant, e.g. ext2fs, smbfs, nwfs, and ntfs. > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 14:38:43 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6B03F10656AB for ; Mon, 8 Nov 2010 14:38:43 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id 2335D8FC20 for ; Mon, 8 Nov 2010 14:38:42 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1PFSrp-0005bJ-I7 for freebsd-fs@freebsd.org; Mon, 08 Nov 2010 15:38:41 +0100 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 08 Nov 2010 15:38:41 +0100 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 08 Nov 2010 15:38:41 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Date: Mon, 08 Nov 2010 15:38:30 +0100 Lines: 21 Message-ID: References: <20101108143130.GA2799@tops> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6 In-Reply-To: <20101108143130.GA2799@tops> X-Enigmail-Version: 1.1.2 Subject: Re: The state of Giant lock in the file systems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 14:38:43 -0000 On 11/08/10 15:31, Gleb Kurtsou wrote: > On (08/11/2010 13:28), Ivan Voras wrote: >> I was looking at fusefs sources and there is a dance it does with the >> Giant lock which looks fishy. > It's intended to be fishy. No kernel level locks should be held before > returning to userland, in other words on each syscall vnode is locked (+ > Giant lock for fs if needed), then it's unlocked by filesystem and > relocked upon callback from userspace. puffs is MPSAFE if that could be > of any help for you. I don't think we're talking about quite the same thing here. I'm talking about fuse's DO_GIANT_MANUALLY flag, with awareness that fuse does:

#ifdef MNTK_MPSAFE
	mp->mnt_kern_flag |= MNTK_MPSAFE;
#endif

What are you talking about?
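For reference, the unlock/relock pattern Gleb describes in the quoted text above can be modeled in plain userspace C. This is only a sketch under stated assumptions: `toy_lock`, `call_userland_unlocked` and `record_lock_state` are made-up names for illustration, not fuse or kernel functions, and the "lock" is a boolean rather than a real vnode lock or Giant.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a kernel lock (vnode lock or Giant); hypothetical. */
struct toy_lock {
	bool held;
};

static void
toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void
toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Models the round trip to the userland daemon. */
typedef int (*upcall_fn)(struct toy_lock *l, void *arg);

/*
 * Called with the lock held (as on syscall entry). Drops the lock for
 * the duration of the upcall and reacquires it before returning, so
 * nothing is held while control is in "userland".
 */
static int
call_userland_unlocked(struct toy_lock *l, upcall_fn fn, void *arg)
{
	int error;

	assert(l->held);
	toy_lock_release(l);
	error = fn(l, arg);
	toy_lock_acquire(l);
	return (error);
}

/* Example upcall: records whether the lock was held while it ran. */
static int
record_lock_state(struct toy_lock *l, void *arg)
{
	*(bool *)arg = l->held;
	return (0);
}
```

A caller would acquire the lock, invoke `call_userland_unlocked()` with `record_lock_state`, and expect the recorded state to be "not held" while the lock is held again on return.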
From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 15:05:05 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D93851065697; Mon, 8 Nov 2010 15:05:05 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id A67D68FC1D; Mon, 8 Nov 2010 15:05:05 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 590B446B06; Mon, 8 Nov 2010 10:05:05 -0500 (EST) Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 85A328A029; Mon, 8 Nov 2010 10:05:04 -0500 (EST) From: John Baldwin To: freebsd-current@freebsd.org Date: Mon, 8 Nov 2010 09:42:41 -0500 User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; ) References: <4CD7C8FC.900@icyb.net.ua> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="utf-8" Content-Transfer-Encoding: 7bit Message-Id: <201011080942.41546.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (bigwig.baldwin.cx); Mon, 08 Nov 2010 10:05:04 -0500 (EST) X-Virus-Scanned: clamav-milter 0.96.3 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-1.9 required=4.2 tests=BAYES_00 autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on bigwig.baldwin.cx Cc: freebsd-fs@freebsd.org, Ivan Voras Subject: Re: another fuse panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 15:05:06 -0000 On Monday, November 08, 2010 6:35:55 am Ivan Voras wrote: > On 11/08/10 10:55, Andriy Gapon wrote: > > > > JFYI. 
> > Fatal trap 12: page fault while in kernel mode > > Can you find any set of circumstances which make this repeatable? > > This panic apparently goes like this: > > 1) used by devfs_open(): > 47 static struct cdevsw fuse_cdevsw = { > 48 .d_open = fusedev_open, > > 2) in fusedev_open(): > 119 fdata = fdata_alloc(dev, td->td_ucred); > > 3) in fdata_alloc(): > 297 data->daemoncred = crhold(cred); > > in other words, td->td_ucred from td passed to fusedev_open (presumably > when the device is opened from the userland) appears to be NULL. > > I don't know if there is any normal set of circumstances under which > this is expected. No, td_ucred should never be NULL. -- John Baldwin From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 15:05:10 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6744A10656BD; Mon, 8 Nov 2010 15:05:10 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 3282C8FC12; Mon, 8 Nov 2010 15:05:10 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id CB41046B35; Mon, 8 Nov 2010 10:05:09 -0500 (EST) Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id A65268A027; Mon, 8 Nov 2010 10:05:08 -0500 (EST) From: John Baldwin To: freebsd-fs@freebsd.org Date: Mon, 8 Nov 2010 10:04:59 -0500 User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; ) References: In-Reply-To: MIME-Version: 1.0 Message-Id: <201011081004.59640.jhb@freebsd.org> Content-Type: Text/Plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (bigwig.baldwin.cx); Mon, 08 Nov 2010 10:05:08 -0500 (EST) X-Virus-Scanned: clamav-milter 0.96.3 at 
bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-1.9 required=4.2 tests=BAYES_00 autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on bigwig.baldwin.cx Cc: Ivan Voras Subject: Re: The state of Giant lock in the file systems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 15:05:10 -0000 On Monday, November 08, 2010 7:28:26 am Ivan Voras wrote: > I was looking at fusefs sources and there is a dance it does with the > Giant lock which looks fishy. > > Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of > mentionings, but if I understand it correctly only these "active" instances: > > 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure > out what they're guarding > 2) some manual locking and unlocking in nfsclient which appears to only > guard printf() (???) > 3) some more locking in nfsserver which apparently is only there to > guard the underlying local file system > 4) coda, which appears to be the only one marked with D_NEEDGIANT, but > doesn't do much of its own interfacing with it > > Except for these, is there any more magic that would need to be resolved > to excise Giant from VFS? > > Would it be correct to think that coda is the single biggest obstacle? Err, all the VFS_LOCK_GIANT() stuff for filesystems that do not have MNTK_MPSAFE set. I believe the currently MPSAFE fs's are UFS, ZFS, MSDOSFS, CD9660, UDF, NFS client, and devfs. I think all others are !MPSAFE still. 
-- John Baldwin From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 15:08:12 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C0C091065672; Mon, 8 Nov 2010 15:08:12 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-pv0-f182.google.com (mail-pv0-f182.google.com [74.125.83.182]) by mx1.freebsd.org (Postfix) with ESMTP id 8C3578FC0C; Mon, 8 Nov 2010 15:08:12 +0000 (UTC) Received: by pvc22 with SMTP id 22so1343640pvc.13 for ; Mon, 08 Nov 2010 07:08:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:mime-version:sender:received :in-reply-to:references:from:date:x-google-sender-auth:message-id :subject:to:cc:content-type:content-transfer-encoding; bh=1y/z5S8bvr0wnQxCmQAL8gHVfZt+to3OEni7ST3Gy3M=; b=uPWFt+qy1zDiShhx/XOYSk+eyqxYjBiVN15ujbFO9UPY8FRAiGDc5KC/PLm70CLh27 eQmV9/4DKyDWG+5tz1rgf6MhMUtzOJ1109dpl945vNxVebRqxd3RUbzViHGH1Cg1ayE8 r0uRbC1PqUHbxs5V4guraqlKQe4FMnF18KsZY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; b=fUyMbRYYbIjo9CfOMVZo4F1P74ntuncrHlKkvaH3C/2hBs5Hb/9/XY3QTbyDqEdLm3 GswIJC97SrYA1lBeEUp6UP4j4GWEtRBfmkJKbsbtS7frzDZk9qF1VSf3eKB9ajBVjaNk FEQCIZvcBPaei9ZkslQ8BML33NSSaQiB6rhVA= Received: by 10.229.246.136 with SMTP id ly8mr5134771qcb.237.1289228891427; Mon, 08 Nov 2010 07:08:11 -0800 (PST) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.229.40.145 with HTTP; Mon, 8 Nov 2010 07:07:29 -0800 (PST) In-Reply-To: <201011081004.59640.jhb@freebsd.org> References: <201011081004.59640.jhb@freebsd.org> From: Ivan Voras Date: Mon, 8 Nov 2010 16:07:29 +0100 X-Google-Sender-Auth: x8ab4LFurqRWZNp0ap7bTcPEfmA Message-ID: To: John Baldwin Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 
quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: The state of Giant lock in the file systems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 15:08:12 -0000 On 8 November 2010 16:04, John Baldwin wrote: > On Monday, November 08, 2010 7:28:26 am Ivan Voras wrote: >> I was looking at fusefs sources and there is a dance it does with the >> Giant lock which looks fishy. >> >> Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of >> mentionings, but if I understand it correctly only these "active" instances: >> >> 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure >> out what they're guarding >> 2) some manual locking and unlocking in nfsclient which appears to only >> guard printf() (???) >> 3) some more locking in nfsserver which apparently is only there to >> guard the underlying local file system >> 4) coda, which appears to be the only one marked with D_NEEDGIANT, but >> doesn't do much of its own interfacing with it >> >> Except for these, is there any more magic that would need to be resolved >> to excise Giant from VFS? >> >> Would it be correct to think that coda is the single biggest obstacle? > > Err, all the VFS_LOCK_GIANT() stuff for filesystems that do not have > MNTK_MPSAFE set. I believe the currently MPSAFE fs's are UFS, ZFS, MSDOSFS, > CD9660, UDF, NFS client, and devfs. I think all others are !MPSAFE still. Thanks! It seemed too easy to be true.
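The idea John describes, taking Giant only around operations on mounts that lack MNTK_MPSAFE, can be illustrated with a small userspace model. This is a sketch, not the kernel's VFS_LOCK_GIANT() macro: the `toy_*` names and the flag value are assumptions made up for this example, and Giant is represented by a counter rather than a mutex.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag value; the kernel defines MNTK_MPSAFE itself. */
#define TOY_MNTK_MPSAFE	0x1u

struct toy_mount {
	unsigned int kern_flag;
};

/* Stand-in for Giant: a depth counter instead of a real mutex. */
static int toy_giant_depth;

/*
 * Take "Giant" only if the mount is not marked MPSAFE.
 * Returns 1 if Giant was taken, 0 otherwise; the caller passes the
 * result back to toy_vfs_unlock_giant().
 */
static int
toy_vfs_lock_giant(const struct toy_mount *mp)
{
	if (mp != NULL && (mp->kern_flag & TOY_MNTK_MPSAFE) == 0) {
		toy_giant_depth++;
		return (1);
	}
	return (0);
}

/* Release "Giant" only if the matching lock call actually took it. */
static void
toy_vfs_unlock_giant(int locked)
{
	if (locked) {
		assert(toy_giant_depth > 0);
		toy_giant_depth--;
	}
}
```

In this model an MPSAFE mount never touches the counter, which is why removing the conditional locking only becomes possible once every remaining filesystem has its own locking.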
From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 15:10:33 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6734D1065675; Mon, 8 Nov 2010 15:10:33 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id D41788FC2B; Mon, 8 Nov 2010 15:10:32 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id oA8FATQW000494 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 8 Nov 2010 17:10:29 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id oA8FASWf017413; Mon, 8 Nov 2010 17:10:28 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id oA8FASPR017412; Mon, 8 Nov 2010 17:10:28 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Mon, 8 Nov 2010 17:10:28 +0200 From: Kostik Belousov To: John Baldwin Message-ID: <20101108151028.GI2392@deviant.kiev.zoral.com.ua> References: <201011081004.59640.jhb@freebsd.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="SvH2i/6Q4qfAo9X5" Content-Disposition: inline In-Reply-To: <201011081004.59640.jhb@freebsd.org> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-2.5 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_20, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: 
freebsd-fs@freebsd.org, Ivan Voras Subject: Re: The state of Giant lock in the file systems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 15:10:33 -0000 --SvH2i/6Q4qfAo9X5 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Nov 08, 2010 at 10:04:59AM -0500, John Baldwin wrote: > On Monday, November 08, 2010 7:28:26 am Ivan Voras wrote: > > I was looking at fusefs sources and there is a dance it does with the > > Giant lock which looks fishy. > > > > Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of > > mentionings, but if I understand it correctly only these "active" instances: > > > > 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure > > out what they're guarding > > 2) some manual locking and unlocking in nfsclient which appears to only > > guard printf() (???) > > 3) some more locking in nfsserver which apparently is only there to > > guard the underlying local file system > > 4) coda, which appears to be the only one marked with D_NEEDGIANT, but > > doesn't do much of its own interfacing with it > > > > Except for these, is there any more magic that would need to be resolved > > to excise Giant from VFS? > > > > Would it be correct to think that coda is the single biggest obstacle? > > Err, all the VFS_LOCK_GIANT() stuff for filesystems that do not have > MNTK_MPSAFE set. I believe the currently MPSAFE fs's are UFS, ZFS, MSDOSFS, > CD9660, UDF, NFS client, and devfs. I think all others are !MPSAFE still. pseudofs-based fses are MPSAFE too. I have already said several times that I will remove VFS_LOCK_GIANT after smbfs is locked. The patch for the removal has been sitting in my repository for almost a year.
--SvH2i/6Q4qfAo9X5 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (FreeBSD) iEYEARECAAYFAkzYEuQACgkQC3+MBN1Mb4j9awCgjT/sNXF737obeLixZFmTJgZU mwwAoMXrnVc+Jxaoz0R9iXmeV0TKyDXn =5/2O -----END PGP SIGNATURE----- --SvH2i/6Q4qfAo9X5-- From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 15:15:51 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 461731065670; Mon, 8 Nov 2010 15:15:51 +0000 (UTC) (envelope-from pjd@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 1BA7E8FC1A; Mon, 8 Nov 2010 15:15:51 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oA8FFoG4053454; Mon, 8 Nov 2010 15:15:50 GMT (envelope-from pjd@freefall.freebsd.org) Received: (from pjd@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oA8FFoB1053450; Mon, 8 Nov 2010 15:15:50 GMT (envelope-from pjd) Date: Mon, 8 Nov 2010 15:15:50 GMT Message-Id: <201011081515.oA8FFoB1053450@freefall.freebsd.org> To: am@raisa.eu.org, pjd@FreeBSD.org, freebsd-fs@FreeBSD.org, pjd@FreeBSD.org From: pjd@FreeBSD.org Cc: Subject: Re: kern/151910: [zfs] booting from raidz/raidz2 on ciss(4) doesn't work X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 15:15:51 -0000 Synopsis: [zfs] booting from raidz/raidz2 on ciss(4) doesn't work State-Changed-From-To: open->feedback State-Changed-By: pjd State-Changed-When: pon 8 lis 2010 15:08:24 UTC State-Changed-Why: Could you take a look at two files in FreeBSD HEAD: sys/boot/i386/zfsboot/zfsboot.c sys/boot/i386/libi386/biosdisk.c Look for VIRTUALBOX in there and apply 
the same changes to your stable/8 code or just modify the code to use code that is compiled with VIRTUALBOX defined. There is a bug in VirtualBox that the BIOS reports only one disk available, but if you ignore that and just look for more, you will find them. Maybe there is a similar bug in your BIOS? Please try it out and let me know. If it doesn't work we can add more debugging output to see where and why it fails exactly. Responsible-Changed-From-To: freebsd-fs->pjd Responsible-Changed-By: pjd Responsible-Changed-When: pon 8 lis 2010 15:08:24 UTC Responsible-Changed-Why: I'll take this one. http://www.freebsd.org/cgi/query-pr.cgi?pr=151910 From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 15:29:11 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8B31C106566C; Mon, 8 Nov 2010 15:29:11 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-pz0-f54.google.com (mail-pz0-f54.google.com [209.85.210.54]) by mx1.freebsd.org (Postfix) with ESMTP id 4D4D68FC18; Mon, 8 Nov 2010 15:29:10 +0000 (UTC) Received: by pzk12 with SMTP id 12so437004pzk.13 for ; Mon, 08 Nov 2010 07:29:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:mime-version:sender:received :in-reply-to:references:from:date:x-google-sender-auth:message-id :subject:to:cc:content-type; bh=rYmgPQS2EXgeBYxtrofZIzZkdPKATIa4SCiEhDI8wuU=; b=D+xoy9r5b2bQY2gvqrG/XDCaKjj3BPMh9pQbHUhfjxMeur4oj861hLBR0ep/xVQw5p CiRK/ipzX3o8hjnwFyOwsdj2YronuA+P3PRTMxOOeR6EMCHF7hFW58b0pmnRIF4WOsas Kbnnm+dazRAGJ0X+HwKdKgVBCHSRkg7IPR3eo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; b=Ft2W1uvTBhJrM8vr3z79spSP8jR/SdyuMc/kMO/3b2OQMHoVY6q1SWnW5hQj5rVuqN Vb4pfLuDcWIp0zgsiEi9QYelAB6cg7tIDHCFtAvpwnm7qVPv1SUXifcjvxa2cJC4L5pl
wMAFhwPhfoEfPKfeEvxHRX4+X5qwW7ZRmxC5Y= Received: by 10.229.212.5 with SMTP id gq5mr5139312qcb.275.1289230150158; Mon, 08 Nov 2010 07:29:10 -0800 (PST) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.229.40.145 with HTTP; Mon, 8 Nov 2010 07:28:29 -0800 (PST) In-Reply-To: <20101108151028.GI2392@deviant.kiev.zoral.com.ua> References: <201011081004.59640.jhb@freebsd.org> <20101108151028.GI2392@deviant.kiev.zoral.com.ua> From: Ivan Voras Date: Mon, 8 Nov 2010 16:28:29 +0100 X-Google-Sender-Auth: ItHWOQKCrhtLyU7It-eoPpq9uN4 Message-ID: To: Kostik Belousov Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org Subject: Re: The state of Giant lock in the file systems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 15:29:11 -0000 On 8 November 2010 16:10, Kostik Belousov wrote: > I already claimed several times that I will remove VFS_LOCK_GIANT > after smbfs is locked. Patch for removal is sitting in my repository > for almost a year. Ok, I've made a little table here: http://wiki.freebsd.org/MPSAFE_VFS Just to get my understanding clear on this: in the case of VFS, Giant protects / prevents concurrent execution of all vfsops or something more / something other than that? 
From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 16:02:57 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 08A881065674; Mon, 8 Nov 2010 16:02:57 +0000 (UTC) (envelope-from jh@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id D2B958FC15; Mon, 8 Nov 2010 16:02:56 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oA8G2uuS004790; Mon, 8 Nov 2010 16:02:56 GMT (envelope-from jh@freefall.freebsd.org) Received: (from jh@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oA8G2ugq004786; Mon, 8 Nov 2010 16:02:56 GMT (envelope-from jh) Date: Mon, 8 Nov 2010 16:02:56 GMT Message-Id: <201011081602.oA8G2ugq004786@freefall.freebsd.org> To: anatoly.borodin@gmail.com, jh@FreeBSD.org, freebsd-fs@FreeBSD.org From: jh@FreeBSD.org Cc: Subject: Re: bin/124424: [zfs] zfs(8): zfs list -r shows strange snapshots' size X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 16:02:57 -0000 Synopsis: [zfs] zfs(8): zfs list -r shows strange snapshots' size State-Changed-From-To: feedback->closed State-Changed-By: jh State-Changed-When: Mon Nov 8 16:02:56 UTC 2010 State-Changed-Why: Feedback timeout. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=124424 From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 16:32:27 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2F377106566C for ; Mon, 8 Nov 2010 16:32:27 +0000 (UTC) (envelope-from monthadar@gmail.com) Received: from mail-pz0-f54.google.com (mail-pz0-f54.google.com [209.85.210.54]) by mx1.freebsd.org (Postfix) with ESMTP id 0734E8FC0C for ; Mon, 8 Nov 2010 16:32:26 +0000 (UTC) Received: by pzk12 with SMTP id 12so497681pzk.13 for ; Mon, 08 Nov 2010 08:32:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:date:message-id :subject:from:to:content-type; bh=kWS+q2CIU2lyQgKhVEyK9bL1WqM66n/QNaVYFYvGShs=; b=ZhrAMS7FF/4WSvs3lV3a6GJi3pXRH3x/RlLCOR/A6w6HlBmEAMywYMbwBPiwsyZbhB E4ZFDW8IGmV1FTS2Bz/ZnuijwFhfLNl9OHLFdyfj3NVYW+QFesWr2ze/IY4uwdCGL1zi EwlR5Cgfta+UjPcEe6iFLmDsCi2oGMvbnvj0Q= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=B5OiWTEV6gjYW//duRaGd8zt2BUmZ7afSAkpqp1UcyIYKo8Q9KnKihmjvxRbKOYEQ3 OT8+EuttQb6jmLZ1+MHeGgHZt7JmgXfPPwDrcrjJKy0RRXw0DTW9IuEfwVDB5jCTsPun HxixWKYDCSs7SXXvuR+JD/U/CUCgRl6nHtEbQ= MIME-Version: 1.0 Received: by 10.229.225.199 with SMTP id it7mr5377936qcb.33.1289232617849; Mon, 08 Nov 2010 08:10:17 -0800 (PST) Received: by 10.229.182.77 with HTTP; Mon, 8 Nov 2010 08:10:17 -0800 (PST) Date: Mon, 8 Nov 2010 17:10:17 +0100 Message-ID: From: Monthadar Al Jaberi To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Subject: problem mounting from flash [Invalid sectorsize] [g_vfs_done() error=22] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 16:32:27 
-0000

Hi, I don't know if I am asking in the wrong place, but this has to do with a filesystem on the onboard flash (16MB) of a RouterStation Pro board. I am running FreeBSD Current 201010 with the kernel configuration file /usr/src/sys/mips/conf/AR71XX and device geom_redboot, but I get this error when I try to mount from flash:

mount /dev/redboot/fs /var/fs
mount: /dev/redboot/fs Invalid sectorsize 65536 for superblock size 8192: Invalid argument

So I guessed it has to do with the flash being configured in 64k sectors, according to the boot output:

...
mx25l0: at cs 0 on spibus0
mx25l0: mx25ll128, sector 65536 bytes, 256 sectors
...

So I just tried to change SBLOCKSIZE from 8192 to 65536 in /usr/src/sys/ufs/ffs/fs.h, but then I got this error:

mount /dev/redboot/fs /mnt/fs
g_vfs_done():redboot/fs[READ(offset=8192, length=65536)]error = 22
mount: /dev/redboot/fs : Invalid argument

The filesystem is generated from an empty skeleton using:

makefs -t ffs -B big -s 128k image-name directory-path

Then I transfer the image to the flash using the RedBoot bootloader. Am I generating an incorrect filesystem image? I don't understand the offset and length in the last error message. I couldn't use cat to dump the content of /dev/redboot/fs; it gives an invalid argument error. But I can use read(fd, buf, 65536) to read data. The size has to be 64k (hint from http://wiki.freebsd.org/AdrianChadd/UbiquityRouterstationPro). Any help is much appreciated.
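One possible reading of the two errors above, written out as arithmetic. This sketch assumes that the mount code requires the superblock size to be a whole multiple of the device sector size, and that the storage layer rejects any read whose offset or length is not sector-aligned; neither assumption is confirmed in the message itself, and the `toy_*` function names are invented for illustration. Error 22 is EINVAL, which matches the "Invalid argument" text.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Returns 0 if the superblock size is a whole number of sectors,
 * EINVAL otherwise. With 8192 and 65536 this fails, since
 * 8192 % 65536 == 8192.
 */
static int
toy_check_sblock(uint32_t sblocksize, uint32_t sectorsize)
{
	if (sectorsize == 0 || sblocksize % sectorsize != 0)
		return (EINVAL);
	return (0);
}

/*
 * Returns 0 if both the offset and the length of a request fall on
 * sector boundaries, EINVAL otherwise. A read at offset 8192 on a
 * device with 65536-byte sectors fails even when the length is 65536.
 */
static int
toy_check_io(uint64_t offset, uint64_t length, uint32_t sectorsize)
{
	if (sectorsize == 0)
		return (EINVAL);
	if (offset % sectorsize != 0 || length % sectorsize != 0)
		return (EINVAL);
	return (0);
}
```

Under these assumptions, raising SBLOCKSIZE to 65536 satisfies the first check but leaves the second one failing, because the read still starts at byte 8192, which is not a multiple of the 64k sector size.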
-- //Monthadar Al Jaberi From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 17:41:56 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 852471065673; Mon, 8 Nov 2010 17:41:56 +0000 (UTC) (envelope-from sarawgi.aditya@gmail.com) Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id 4744D8FC1A; Mon, 8 Nov 2010 17:41:56 +0000 (UTC) Received: by pwj5 with SMTP id 5so53605pwj.13 for ; Mon, 08 Nov 2010 09:41:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:cc:subject :message-id:references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=UEkQ27ifyK9gVdHmvIkulyOdojh5gOSz4OrW8Vy+kxs=; b=EcdO/CPPKVdQNU7fC0zc5Pu/nT8X/MVirF0q4P+l/G7I3VtPmpGEPsIV2EKmQqgXWd 3ZCLIO/s13i8WLWok4CDmA1E1hfJRLC5P2Ar9kQUiWoQ9AsO34YcXW6+Vrz59icSasoz a4Z+pyorIcvY6JvSB1vCqcHNCkTKVkHTSk7Fs= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=v303C99e1ZdPfVwapyBaD804PoTQLm5Li4QF0FtowSgRBHuoqu918rhxthqvLShZNE P8TgZa+tx/4JXHYxn4Wzyeyi2rOd1HL57158K8Rlcn4ZvTz3JObRFTIHyodRNGfW1GZu X1KrgVYdT/zXeSXwPxmJLkzKHoKIG3IiYR4/c= Received: by 10.142.237.4 with SMTP id k4mr5118835wfh.171.1289238115655; Mon, 08 Nov 2010 09:41:55 -0800 (PST) Received: from earth ([183.87.49.109]) by mx.google.com with ESMTPS id x35sm175623wfd.1.2010.11.08.09.41.52 (version=TLSv1/SSLv3 cipher=RC4-MD5); Mon, 08 Nov 2010 09:41:54 -0800 (PST) Date: Mon, 8 Nov 2010 23:13:32 +0530 From: Aditya Sarawgi To: Doug Barton Message-ID: <20101108174327.GC2066@earth> References: <20100929031825.L683@besplex.bde.org> <20100929084801.M948@besplex.bde.org> <20100929041650.GA1553@aditya> <201009290917.05269.jhb@freebsd.org> 
<20100929202526.GA1564@aditya> <4CD0A3E8.4080304@FreeBSD.org> <4CD201AE.3040409@FreeBSD.org> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="BXVAT5kNtrzKuDFl" Content-Disposition: inline In-Reply-To: <4CD201AE.3040409@FreeBSD.org> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: fs@freebsd.org Subject: Re: ext2fs now extremely slow X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 17:41:56 -0000 --BXVAT5kNtrzKuDFl Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Wed, Nov 03, 2010 at 05:43:26PM -0700, Doug Barton wrote: > On 11/03/10 16:38, Aditya Sarawgi wrote: > > On Wed, Nov 3, 2010 at 5:21 AM, Doug Barton wrote: > > >> Is anything happening with this? I recently built a new system that is > >> multi-booting windows, freebsd, and ubuntu. I chose ext[23]fs for my /home > >> partition so that I could share unix'y stuff between freebsd and linux, but > >> I'm having both performance and stability problems, and today (fortunately > >> for the first time, and fortunately recoverable) I had actual data loss. I'm > >> happy to be a guinea pig for new code if people are reasonably sure that it > >> will help, but if the situation doesn't improve I will have to reformat. > >> > > > > Are you suffering from these problems on CURRENT ? > > Yes. > > > Can you please elaborate > > on the performance and stability issue you are facing ? Any specific scenario ? > > What I did was create a fairly large (37G) /home and put all the stuff > I'd like to have access to from all 3 systems, like svn, my ports tree, > etc. I also ended up putting my obj directory there because I created my > /usr/local a little smaller than I should have and after installing > gnome I ran out of room. :) > > I should also point out that this is on a brand new desktop system that > was donated by a FreeBSD user. 
It's a C2D running at 3.17G, 4G RAM, and > a fast 250G disk. I'm running amd64 -current. Everything disk intensive > (updating ports with csup, updating my svn trees, etc.) is slower on > this system than it was on my laptop where all the same stuff was on > UFS2. Bruce's message that started this thread alluded to the problems, > my experience has been similar. > > Regarding stability, sometimes (but not always) when I'm doing the above > listed disk-intensive things on an otherwise idle system I've had the > system lock up. Not panic, not reboot, just wedge. I'm running X when > this happens, so I'm not 100% sure that the disk activity is the > culprit, but it seems very suspicious. Yesterday was a very bad day, I > had to do 3 tries to get all the way through a buildworld/kernel, mostly > because the last 2 crashes resulted in my /usr/src (which is actually > /home/svn/head) and /usr/obj (/home/obj-9) directories getting corrupted > respectively. Today (running r214694) has actually been quite good, > although I haven't tried a buildworld yet. > I am not sure if this is the right use case for ext2fs. > > You can test Zheng's preallocation patch for ext2fs, there is a > > serious lack of testers for that. > > I would be happy to do that, but my reading of this thread last month > didn't produce a clear "try this version of the patch" neon sign. > Various people referred to suggestions, updates, etc. If someone could > provide a URL for the right patch to try, as well as a suggestion for > benchmarking methodology, I'll be glad to do so. > I have attached the patch. Some primitive testing, like copying files, untarring, etc., and comparing with the existing ext2fs will do. If you are looking to do full-fledged benchmarking, then I would suggest iozone, blogbench, dbench, etc. > >> On a related note, is there any way to use the journaling features of ext3fs > >> in FreeBSD? 
When I boot the linux partition it's treating the fs as ext3fs, > >> but AFAICS we only have ext2fs capabilities. > >> > > > > Journaling is difficult to bring in, especially if one is planning to > > have a BSDL version. > > Ok. I can live with accessing the stuff as ext2 from FreeBSD, and I can > even live with a minor performance penalty. What I can't live with is > instability and/or data corruption; and it should go without saying that > our users should not have to live with that either. > We were planning to use gjournal, but it is too closely tied to UFS and it wouldn't be compatible with ext2fs journaling. Haiku seems to have journaling for ext2fs, but that depends a lot on BFS journaling. Bringing in journaling code is not an option over here since they have their own separate journaling layer. > > Thanks for the response, > > Doug > > -- > > Nothin' ever doesn't change, but nothin' changes much. > -- OK Go > > Breadth of IT experience, and depth of knowledge in the DNS. > Yours for the right price. 
:) http://SupersetSolutions.com/ > --BXVAT5kNtrzKuDFl Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="ext2fs_prealloc.diff" diff -urN /usr/src/sys/fs/ext2fs/ext2_alloc.c new/ext2_alloc.c --- /usr/src/sys/fs/ext2fs/ext2_alloc.c 2010-01-14 22:30:54.000000000 +0800 +++ new/ext2_alloc.c 2010-08-19 02:47:29.000000000 +0800 @@ -50,6 +50,9 @@ #include #include #include +#include + +#define phy_blk(cg, fs) (((cg) * (fs->e2fs->e2fs_fpg)) + fs->e2fs->e2fs_first_dblock) static daddr_t ext2_alloccg(struct inode *, int, daddr_t, int); static u_long ext2_dirpref(struct inode *); @@ -59,37 +62,524 @@ int)); static daddr_t ext2_nodealloccg(struct inode *, int, daddr_t, int); static daddr_t ext2_mapsearch(struct m_ext2fs *, char *, daddr_t); + +/* For reservation window */ +static u_long ext2_alloc_blk(struct inode *, int, struct buf *, int32_t, struct ext2_rsv_win *); +static int ext2_alloc_new_rsv(struct inode *, int, struct buf *, int32_t); +static int ext2_bpref_in_rsv(struct ext2_rsv_win *, int32_t); +static int ext2_find_rsv(struct ext2_rsv_win *, struct ext2_rsv_win *, + struct m_ext2fs *, int32_t, int); +static void ext2_remove_rsv_win(struct m_ext2fs *, struct ext2_rsv_win *); +static u_long ext2_rsvalloc(struct m_ext2fs *, struct inode *, + int, struct buf *, int32_t, int); +static daddr_t ext2_search_next_block(struct m_ext2fs *, char *, int, int); +static struct ext2_rsv_win *ext2_search_rsv(struct ext2_rsv_win_tree *, int32_t); + +RB_GENERATE(ext2_rsv_win_tree, ext2_rsv_win, rsv_link, ext2_rsv_win_cmp); + /* * Allocate a block in the file system. * - * A preference may be optionally specified. If a preference is given - * the following hierarchy is used to allocate a block: - * 1) allocate the requested block. - * 2) allocate a rotationally optimal block in the same cylinder. - * 3) allocate a block in the same cylinder group. - * 4) quadradically rehash into other cylinder groups, until an - * available block is located. 
- * If no block preference is given the following hierarchy is used - * to allocate a block: - * 1) allocate a block in the cylinder group that contains the - * inode for the file. - * 2) quadradically rehash into other cylinder groups, until an - * available block is located. - * - * A preference may be optionally specified. If a preference is given - * the following hierarchy is used to allocate a block: - * 1) allocate the requested block. - * 2) allocate a rotationally optimal block in the same cylinder. - * 3) allocate a block in the same cylinder group. - * 4) quadradically rehash into other cylinder groups, until an - * available block is located. - * If no block preference is given the following hierarchy is used - * to allocate a block: - * 1) allocate a block in the cylinder group that contains the - * inode for the file. - * 2) quadradically rehash into other cylinder groups, until an - * available block is located. + * By given preference: + * Check whether inode has a reservation window and preference + * is within it and try to allocate a free block from + * this reservation window. + * If not, traverse RB tree to find a place, which is not in + * any window and insert it to RB tree to try to allocate a + * free block again. + * If it fails, try to allocate a free block in other cylinder + * groups without preference. + */ + +/* + * Allocate a free block. + * + * First check whether reservation window is used. + * If reservation window is used, try to allocate a free + * block from the reservation window. If it fails, traverse + * the bitmap to find a free block. + * If reservation window is not used, try to allocate + * a free block by bpref. If it fails, traverse the bitmap + * to find a free block. 
*/ +static u_long +ext2_alloc_blk(struct inode *ip, int cg, struct buf *bp, + int32_t bpref, struct ext2_rsv_win *rp) +{ + struct m_ext2fs *fs; + struct ext2mount *ump; + int bno, start, end; + char *bbp; + + fs = ip->i_e2fs; + ump = ip->i_ump; + bbp = (char *)bp->b_data; + + if (fs->e2fs_gd[cg].ext2bgd_nbfree == 0) + return (0); + + if (bpref < 0) + bpref = 0; + + /* Check whether it use reservation window */ + if (rp != NULL) { + /* + * If window's start is not in this cylinder group, + * try to allocate from the beginning, otherwise + * try to allocate from the beginning of the + * window. + */ + if (dtog(fs, rp->rsv_start) < cg) + start = 0; + else + start = rp->rsv_start; + + /* + * If window's end crosses the end of this group, + * set end variable to the end of this group. + * Otherwise, set it to the window's end. + */ + if (dtog(fs, rp->rsv_end) > cg) + end = phy_blk(cg + 1, fs) - 1; + else + end = rp->rsv_end; + + /* If preference block is within the window, try to allocate it. */ + if (start <= bpref && bpref <= end) { + bpref = dtogd(fs, bpref); + if (isclr(bbp, bpref)) { + rp->rsv_alloc_hit++; + bno = bpref; + goto gotit; + } + } else + if (dtog(fs, rp->rsv_start) == cg) + bpref = dtogd(fs, rp->rsv_start); + else + bpref = 0; + } else { + if (dtog(fs, bpref) != cg) + bpref = 0; + if (bpref != 0) { + bpref = dtogd(fs, bpref); + if (isclr(bbp, bpref)) { + bno = bpref; + goto gotit; + } + } + } + + bno = ext2_mapsearch(fs, bbp, bpref); + if (bno < 0) + return (0); + +gotit: + setbit(bbp, (daddr_t)bno); + EXT2_LOCK(ump); + fs->e2fs->e2fs_fbcount--; + fs->e2fs_gd[cg].ext2bgd_nbfree--; + fs->e2fs_fmod = 1; + EXT2_UNLOCK(ump); + bdwrite(bp); + bno = phy_blk(cg, fs) + bno; + return (bno); +} + +/* + * Initialize reservation window per inode. 
+ */ +void +ext2_init_rsv(struct inode *ip) +{ + struct ext2_rsv_win *rp; + + rp = malloc(sizeof(struct ext2_rsv_win), + M_EXT2NODE, M_WAITOK | M_ZERO); + + /* + * If malloc failed, we just do not use the + * reservation window mechanism. + */ + if (rp == NULL) + return; + + rp->rsv_start = EXT2_RSV_NOT_ALLOCATED; + rp->rsv_end = EXT2_RSV_NOT_ALLOCATED; + + rp->rsv_goal_size = EXT2_RSV_DEFAULT_RESERVE_BLKS; + rp->rsv_alloc_hit = 0; + + ip->i_rsv = rp; +} + +/* + * Discard reservation window. + * + * It is called during the following situations: + * 1. free an inode + * 2. sync inode + * 3. truncate a file + */ +void +ext2_discard_rsv(struct inode *ip) +{ + struct ext2_rsv_win *rp; + + if (ip->i_rsv == NULL) + return; + + rp = ip->i_rsv; + + /* If reservation window is empty, nothing to do */ + if (rp->rsv_end == EXT2_RSV_NOT_ALLOCATED) + return; + + EXT2_TREE_LOCK(ip->i_e2fs); + ext2_remove_rsv_win(ip->i_e2fs, rp); + EXT2_TREE_UNLOCK(ip->i_e2fs); + rp->rsv_goal_size = EXT2_RSV_DEFAULT_RESERVE_BLKS; +} + +/* + * Remove a ext2_rsv_win structure from RB tree. + */ +static void +ext2_remove_rsv_win(struct m_ext2fs *fs, struct ext2_rsv_win *rp) +{ + RB_REMOVE(ext2_rsv_win_tree, fs->e2fs_rsv_tree, rp); + rp->rsv_start = EXT2_RSV_NOT_ALLOCATED; + rp->rsv_end = EXT2_RSV_NOT_ALLOCATED; + rp->rsv_alloc_hit = 0; +} + +/* + * Check bpref is in the reservation window. + */ +static int +ext2_bpref_in_rsv(struct ext2_rsv_win *rp, int32_t bpref) +{ + if (bpref >= 0 && (bpref < rp->rsv_start || bpref > rp->rsv_end)) + return (0); + + return (1); +} + +/* + * Search a tree node from RB tree. It includes the bpref or + * the previous one if bpref is not in any window. 
+ */ +static struct ext2_rsv_win * +ext2_search_rsv(struct ext2_rsv_win_tree *root, int32_t start) +{ + struct ext2_rsv_win *prev, *next; + + if (RB_EMPTY(root)) + return (NULL); + + next = RB_ROOT(root); + do { + prev = next; + if (start < next->rsv_start) + next = RB_LEFT(next, rsv_link); + else if (start > next->rsv_end) + next = RB_RIGHT(next, rsv_link); + else + return (next); + } while (next != NULL); + + if (prev->rsv_start > start) { + next = RB_PREV(ext2_rsv_win_tree, root, prev); + if (next != NULL) + prev = next; + } + + return (prev); +} + +/* + * Find a reservation window by given range from start to + * the end of this cylinder group. + */ +static int +ext2_find_rsv(struct ext2_rsv_win *search, struct ext2_rsv_win *rp, + struct m_ext2fs *fs, int32_t start, int cg) +{ + struct ext2_rsv_win *rsv, *prev; + int32_t cur; + int size = rp->rsv_goal_size; + + if (search == NULL) { + rp->rsv_start = start & ~7; + rp->rsv_end = start + size - 1; + rp->rsv_alloc_hit = 0; + + RB_INSERT(ext2_rsv_win_tree, fs->e2fs_rsv_tree, rp); + + return (0); + } + + /* + * Make the start of reservation window byte-aligned + * in order to can find a free block with bit operations + * in the ext2_search_next_block() function. + */ + cur = start & ~7; + rsv = search; + prev = NULL; + + while (1) { + if (cur <= rsv->rsv_end) + cur = rsv->rsv_end + 1; + + if (dtog(fs, cur) != cg) + return (-1); + + prev = rsv; + rsv = RB_NEXT(ext2_rsv_win_tree, fs->e2fs_rsv_tree, rsv); + + if (rsv == NULL) + break; + + if (cur + size <= rsv->rsv_start) + break; + } + + if (prev != rp && rp->rsv_end != EXT2_RSV_NOT_ALLOCATED) + ext2_remove_rsv_win(fs, rp); + + rp->rsv_start = cur; + rp->rsv_end = cur + size - 1; + rp->rsv_alloc_hit = 0; + + if (prev != rp) + RB_INSERT(ext2_rsv_win_tree, fs->e2fs_rsv_tree, rp); + + return (0); +} + +/* + * Find a free block by given range from bpref to + * the end of this cylinder group. 
+ */ +static daddr_t +ext2_search_next_block(struct m_ext2fs *fs, char *bbp, int bpref, int cg) +{ + daddr_t bno; + int start, loc, len, map, i; + + start = bpref / NBBY; + len = howmany(fs->e2fs->e2fs_fpg, NBBY) - start; + loc = skpc(0xff, len, &bbp[start]); + if (loc == 0) + return (-1); + + i = start + len - loc; + map = bbp[i]; + bno = i * NBBY; + for (i = 1; i < (1 << NBBY); i <<= 1, bno++) { + if ((map & i) == 0) + return (bno); + } + + return (-1); +} + +/* + * Allocate a new reservation window. + */ +static int +ext2_alloc_new_rsv(struct inode *ip, int cg, struct buf *bp, int32_t bpref) +{ + struct m_ext2fs *fs; + struct ext2_rsv_win *rp, *search; + char *bbp; + int start, size, ret; + + fs = ip->i_e2fs; + rp = ip->i_rsv; + bbp = bp->b_data; + size = rp->rsv_goal_size; + + if (bpref <= 0) + start = phy_blk(cg, fs); + else + start = bpref; + + /* Dynamically increase the size of window */ + if (rp->rsv_end != EXT2_RSV_NOT_ALLOCATED) { + if (rp->rsv_alloc_hit > + ((rp->rsv_end - rp->rsv_start + 1) / 2)) { + size = size * 2; + if (size > EXT2_RSV_MAX_RESERVE_BLKS) + size = EXT2_RSV_MAX_RESERVE_BLKS; + rp->rsv_goal_size = size; + } + } + + EXT2_TREE_LOCK(fs); + + search = ext2_search_rsv(fs->e2fs_rsv_tree, start); + +repeat: + ret = ext2_find_rsv(search, rp, fs, start, cg); + if (ret < 0) { + if (rp->rsv_end != EXT2_RSV_NOT_ALLOCATED) + ext2_remove_rsv_win(fs, rp); + EXT2_TREE_UNLOCK(fs); + return (-1); + } + EXT2_TREE_UNLOCK(fs); + + start = dtogd(fs, rp->rsv_start); + start = ext2_search_next_block(fs, bbp, start, cg); + if (start < 0) { + EXT2_TREE_LOCK(fs); + if (rp->rsv_end != EXT2_RSV_NOT_ALLOCATED) + ext2_remove_rsv_win(fs, rp); + EXT2_TREE_UNLOCK(fs); + return (-1); + } + + start = phy_blk(cg, fs) + start; + if (start >= rp->rsv_start && start <= rp->rsv_end) + return (0); + + search = rp; + EXT2_TREE_LOCK(fs); + goto repeat; +} + +/* + * Allocate a free block from reservation window. 
+ */ +static u_long +ext2_rsvalloc(struct m_ext2fs *fs, struct inode *ip, int cg, + struct buf *bp, int32_t bpref, int size) +{ + struct ext2_rsv_win *rp; + int ret; + + rp = ip->i_rsv; + if (rp == NULL) + return (ext2_alloc_blk(ip, cg, bp, bpref, NULL)); + + if (rp->rsv_end == EXT2_RSV_NOT_ALLOCATED || + !ext2_bpref_in_rsv(rp, bpref)) { + ret = ext2_alloc_new_rsv(ip, cg, bp, bpref); + if (ret < 0) + return (0); + } + + return (ext2_alloc_blk(ip, cg, bp, bpref, rp)); +} + +/* + * Allocate a block using reservation window in ext2 file system. + * + * NOTE: This function will replace the ext2_alloc() function. + */ +int +ext2_alloc_rsv(struct inode *ip, int32_t lbn, int32_t bpref, + int size, struct ucred *cred, int32_t *bnp) +{ + struct m_ext2fs *fs; + struct ext2mount *ump; + struct buf *bp; + int32_t bno = 0; + int i, cg, error; + + *bnp = 0; + fs = ip->i_e2fs; + ump = ip->i_ump; + mtx_assert(EXT2_MTX(ump), MA_OWNED); + + if (size == fs->e2fs_bsize && fs->e2fs->e2fs_fbcount == 0) + goto nospace; + if (cred->cr_uid != 0 && + fs->e2fs->e2fs_fbcount < fs->e2fs->e2fs_rbcount) + goto nospace; + + if (bpref >= fs->e2fs->e2fs_bcount) + bpref = 0; + if (bpref == 0) + cg = ino_to_cg(fs, ip->i_number); + else + cg = dtog(fs, bpref); + + /* If cg has some free blocks, then try to allocate a free block from this cg */ + if (fs->e2fs_gd[cg].ext2bgd_nbfree > 0) { + /* Read block bitmap from buffer */ + EXT2_UNLOCK(ump); + error = bread(ip->i_devvp, + fsbtodb(fs, fs->e2fs_gd[cg].ext2bgd_b_bitmap), + (int)fs->e2fs_bsize, NOCRED, &bp); + if (error) { + brelse(bp); + goto ioerror; + } + + EXT2_RSV_LOCK(ip); + /* Try to allocate from reservation window */ + bno = ext2_rsvalloc(fs, ip, cg, bp, bpref, size); + EXT2_RSV_UNLOCK(ip); + if (bno > 0) + goto allocated; + + brelse(bp); + EXT2_LOCK(ump); + } + + /* Just need to try to allocate a free block from rest groups. 
*/ + cg = (cg + 1) % fs->e2fs_gcount; + for (i = 1; i < fs->e2fs_gcount; i++) { + if (fs->e2fs_gd[cg].ext2bgd_nbfree > 0) { + /* Read block bitmap from buffer */ + EXT2_UNLOCK(ump); + error = bread(ip->i_devvp, + fsbtodb(fs, fs->e2fs_gd[cg].ext2bgd_b_bitmap), + (int)fs->e2fs_bsize, NOCRED, &bp); + if (error) { + brelse(bp); + goto ioerror; + } + + EXT2_RSV_LOCK(ip); + bno = ext2_rsvalloc(fs, ip, cg, bp, -1, size); + EXT2_RSV_UNLOCK(ip); + if (bno > 0) + goto allocated; + + brelse(bp); + EXT2_LOCK(ump); + } + + cg++; + if (cg == fs->e2fs_gcount) + cg = 0; + } + +allocated: + if (bno > 0) { + ip->i_next_alloc_block = lbn; + ip->i_next_alloc_goal = bno; + + ip->i_blocks += btodb(fs->e2fs_bsize); + ip->i_flag |= IN_CHANGE | IN_UPDATE; + *bnp = bno; + return (0); + } + +nospace: + EXT2_UNLOCK(ump); + ext2_fserr(fs, cred->cr_uid, "file system full"); + uprintf("\n%s: write failed, file system is full\n", fs->e2fs_fsmnt); + return (ENOSPC); + +ioerror: + ext2_fserr(fs, cred->cr_uid, "file system IO error"); + uprintf("\n%s: write failed, file system IO error\n", fs->e2fs_fsmnt); + return (EIO); +} int ext2_alloc(ip, lbn, bpref, size, cred, bnp) @@ -923,9 +1413,11 @@ start = 0; loc = skpc(0xff, len, &bbp[start]); if (loc == 0) { - printf("start = %d, len = %d, fs = %s\n", - start, len, fs->e2fs_fsmnt); - panic("ext2fs_alloccg: map corrupted"); + /* XXX: just for reservation window */ + return -1; + /*printf("start = %d, len = %d, fs = %s\n",*/ + /*start, len, fs->e2fs_fsmnt);*/ + /*panic("ext2fs_alloccg: map corrupted");*/ /* NOTREACHED */ } } diff -urN /usr/src/sys/fs/ext2fs/ext2_balloc.c new/ext2_balloc.c --- /usr/src/sys/fs/ext2fs/ext2_balloc.c 2010-01-14 22:30:54.000000000 +0800 +++ new/ext2_balloc.c 2010-08-19 02:47:29.000000000 +0800 @@ -49,6 +49,7 @@ #include #include #include +#include /* * Balloc defines the structure of file system storage * by allocating the physical blocks on a device given @@ -78,6 +79,9 @@ fs = ip->i_e2fs; ump = ip->i_ump; + if (ip->i_rsv == 
NULL) + ext2_init_rsv(ip); + /* * check if this is a sequential block allocation. * If so, increment next_alloc fields to allow ext2_blkpref @@ -136,9 +140,9 @@ else nsize = fs->e2fs_bsize; EXT2_LOCK(ump); - error = ext2_alloc(ip, lbn, - ext2_blkpref(ip, lbn, (int)lbn, &ip->i_db[0], 0), - nsize, cred, &newb); + error = ext2_alloc_rsv(ip, lbn, + ext2_blkpref(ip, lbn, (int)lbn, &ip->i_db[0], 0), + nsize, cred, &newb); if (error) return (error); bp = getblk(vp, lbn, nsize, 0, 0, 0); @@ -170,9 +174,9 @@ EXT2_LOCK(ump); pref = ext2_blkpref(ip, lbn, indirs[0].in_off + EXT2_NDIR_BLOCKS, &ip->i_db[0], 0); - if ((error = ext2_alloc(ip, lbn, pref, - (int)fs->e2fs_bsize, cred, &newb))) - return (error); + if ((error = ext2_alloc_rsv(ip, lbn, pref, + (int)fs->e2fs_bsize, cred, &newb))) + return (error); nb = newb; bp = getblk(vp, indirs[1].in_lbn, fs->e2fs_bsize, 0, 0, 0); bp->b_blkno = fsbtodb(fs, newb); @@ -211,7 +215,7 @@ if (pref == 0) pref = ext2_blkpref(ip, lbn, indirs[i].in_off, bap, bp->b_lblkno); - error = ext2_alloc(ip, lbn, pref, (int)fs->e2fs_bsize, cred, &newb); + error = ext2_alloc_rsv(ip, lbn, pref, (int)fs->e2fs_bsize, cred, &newb); if (error) { brelse(bp); return (error); @@ -250,8 +254,8 @@ EXT2_LOCK(ump); pref = ext2_blkpref(ip, lbn, indirs[i].in_off, &bap[0], bp->b_lblkno); - if ((error = ext2_alloc(ip, - lbn, pref, (int)fs->e2fs_bsize, cred, &newb)) != 0) { + if ((error = ext2_alloc_rsv(ip, lbn, pref, + (int)fs->e2fs_bsize, cred, &newb)) != 0) { brelse(bp); return (error); } diff -urN /usr/src/sys/fs/ext2fs/ext2_inode.c new/ext2_inode.c --- /usr/src/sys/fs/ext2fs/ext2_inode.c 2010-01-14 22:30:54.000000000 +0800 +++ new/ext2_inode.c 2010-08-19 02:47:29.000000000 +0800 @@ -52,6 +52,7 @@ #include #include #include +#include static int ext2_indirtrunc(struct inode *, int32_t, int32_t, int32_t, int, long *); @@ -153,6 +154,11 @@ } fs = oip->i_e2fs; osize = oip->i_size; + + EXT2_RSV_LOCK(oip); + ext2_discard_rsv(oip); + EXT2_RSV_UNLOCK(oip); + /* * Lengthen the 
size of the file. We must ensure that the * last byte of the file is allocated. Since the smallest @@ -484,6 +490,10 @@ if (prtactive && vrefcnt(vp) != 0) vprint("ext2_inactive: pushing active", vp); + EXT2_RSV_LOCK(ip); + ext2_discard_rsv(ip); + EXT2_RSV_UNLOCK(ip); + /* * Ignore inodes related to stale file handles. */ @@ -525,11 +535,21 @@ if (prtactive && vrefcnt(vp) != 0) vprint("ufs_reclaim: pushing active", vp); ip = VTOI(vp); + if (ip->i_flag & IN_LAZYMOD) { ip->i_flag |= IN_MODIFIED; ext2_update(vp, 0); } vfs_hash_remove(vp); + + EXT2_RSV_LOCK(ip); + if (ip->i_rsv != NULL) { + free(ip->i_rsv, M_EXT2NODE); + ip->i_rsv = NULL; + } + EXT2_RSV_UNLOCK(ip); + mtx_destroy(&ip->i_rsv_lock); + free(vp->v_data, M_EXT2NODE); vp->v_data = 0; vnode_destroy_vobject(vp); diff -urN /usr/src/sys/fs/ext2fs/ext2_rsv_win.h new/ext2_rsv_win.h --- /usr/src/sys/fs/ext2fs/ext2_rsv_win.h 1970-01-01 08:00:00.000000000 +0800 +++ new/ext2_rsv_win.h 2010-08-19 02:47:29.000000000 +0800 @@ -0,0 +1,78 @@ +/*- + * Copyright (c) 2010, 2010 Zheng Liu + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD: src/sys/fs/ext2fs/ext2_rsv_win.h,v 0.1 2010/05/08 12:41:51 lz Exp $ + */ +#ifndef _FS_EXT2FS_EXT2_RSV_WIN_H_ +#define _FS_EXT2FS_EXT2_RSV_WIN_H_ + +#include + +#define EXT2_RSV_DEFAULT_RESERVE_BLKS 8 +#define EXT2_RSV_MAX_RESERVE_BLKS 1024 +#define EXT2_RSV_NOT_ALLOCATED 0 + +#define EXT2_RSV_LOCK(ip) mtx_lock(&ip->i_rsv_lock) +#define EXT2_RSV_UNLOCK(ip) mtx_unlock(&ip->i_rsv_lock) + +#define EXT2_TREE_LOCK(fs) mtx_lock(&fs->e2fs_rsv_lock); +#define EXT2_TREE_UNLOCK(fs) mtx_unlock(&fs->e2fs_rsv_lock); + +/* + * Reservation window entry + */ +struct ext2_rsv_win { + RB_ENTRY(ext2_rsv_win) rsv_link; /* RB tree links */ + + int32_t rsv_goal_size; /* Default reservation window size */ + int32_t rsv_alloc_hit; /* Number of allocated windows */ + + int32_t rsv_start; /* First bytes of window */ + int32_t rsv_end; /* End bytes of window */ +}; + +RB_HEAD(ext2_rsv_win_tree, ext2_rsv_win); + +static __inline int +ext2_rsv_win_cmp(const struct ext2_rsv_win *a, + const struct ext2_rsv_win *b) +{ + if (a->rsv_start < b->rsv_start) + return (-1); + if (a->rsv_start == b->rsv_start) + return (0); + + return (1); +} +RB_PROTOTYPE(ext2_rsv_win_tree, ext2_rsv_win, rsv_link, ext2_rsv_win_cmp); + +/* predefine */ +struct inode; +/* ext2_alloc.c */ +void ext2_init_rsv(struct inode *ip); +void ext2_discard_rsv(struct inode *ip); +int ext2_alloc_rsv(struct inode *, int32_t, int32_t, int, struct ucred *, int32_t *); + +#endif /* 
!_FS_EXT2FS_EXT2_RSV_WIN_H_ */ diff -urN /usr/src/sys/fs/ext2fs/ext2_vfsops.c new/ext2_vfsops.c --- /usr/src/sys/fs/ext2fs/ext2_vfsops.c 2010-01-14 22:30:54.000000000 +0800 +++ new/ext2_vfsops.c 2010-08-19 02:47:29.000000000 +0800 @@ -1,4 +1,4 @@ -/*- +/* * modified for EXT2FS support in Lites 1.1 * * Aug 1995, Godmar Back (gback@cs.utah.edu) @@ -61,6 +61,7 @@ #include #include #include +#include static int ext2_flushfiles(struct mount *mp, int flags, struct thread *td); static int ext2_mountfs(struct vnode *, struct mount *); @@ -95,9 +96,9 @@ static int compute_sb_data(struct vnode * devvp, struct ext2fs * es, struct m_ext2fs * fs); -static const char *ext2_opts[] = { "from", "export", "acls", "noexec", - "noatime", "union", "suiddir", "multilabel", "nosymfollow", - "noclusterr", "noclusterw", "force", NULL }; +static const char *ext2_opts[] = { "acls", "async", "export", "force", + "from", "multilabel", "noatime", "noclusterr", "noclusterw", + "noexec", "nosymfollow", "suiddir", "union", NULL }; /* * VFS Operations. 
@@ -581,6 +582,14 @@ if ((error = compute_sb_data(devvp, ump->um_e2fs->e2fs, ump->um_e2fs))) goto out; + /* Initial reservation window index and lock */ + bzero(&ump->um_e2fs->e2fs_rsv_lock, sizeof(struct mtx)); + mtx_init(&ump->um_e2fs->e2fs_rsv_lock, + "rsv tree lock", NULL, MTX_DEF); + ump->um_e2fs->e2fs_rsv_tree = malloc(sizeof(struct ext2_rsv_win_tree), + M_EXT2MNT, M_WAITOK | M_ZERO); + RB_INIT(ump->um_e2fs->e2fs_rsv_tree); + brelse(bp); bp = NULL; fs = ump->um_e2fs; @@ -680,6 +689,8 @@ g_topology_unlock(); PICKUP_GIANT(); vrele(ump->um_devvp); + free(fs->e2fs_rsv_tree, M_EXT2MNT); + mtx_destroy(&fs->e2fs_rsv_lock); free(fs->e2fs_gd, M_EXT2MNT); free(fs->e2fs_contigdirs, M_EXT2MNT); free(fs->e2fs, M_EXT2MNT); @@ -919,6 +930,10 @@ ip->i_prealloc_count = 0; ip->i_prealloc_block = 0; + bzero(&ip->i_rsv_lock, sizeof(struct mtx)); + mtx_init(&ip->i_rsv_lock, "inode rsv lock", NULL, MTX_DEF); + ip->i_rsv = NULL; + /* * Now we want to make sure that block pointers for unused * blocks are zeroed out - ext2_balloc depends on this diff -urN /usr/src/sys/fs/ext2fs/ext2fs.h new/ext2fs.h --- /usr/src/sys/fs/ext2fs/ext2fs.h 2010-01-14 22:30:54.000000000 +0800 +++ new/ext2fs.h 2010-08-19 02:47:29.000000000 +0800 @@ -38,6 +38,7 @@ #define _FS_EXT2FS_EXT2_FS_H #include +#include /* * Special inode numbers @@ -174,6 +175,9 @@ char e2fs_wasvalid; /* valid at mount time */ off_t e2fs_maxfilesize; struct ext2_gd *e2fs_gd; /* Group Descriptors */ + + struct mtx e2fs_rsv_lock; /* Protect reservation window RB tree */ + struct ext2_rsv_win_tree *e2fs_rsv_tree; /* Reservation window index */ }; /* diff -urN /usr/src/sys/fs/ext2fs/inode.h new/inode.h --- /usr/src/sys/fs/ext2fs/inode.h 2010-01-14 22:30:54.000000000 +0800 +++ new/inode.h 2010-08-19 02:47:29.000000000 +0800 @@ -100,6 +100,10 @@ int32_t i_gen; /* Generation number. */ u_int32_t i_uid; /* File owner. */ u_int32_t i_gid; /* File group. 
*/ + + /* Fields for reservation window */ + struct mtx i_rsv_lock; /* Protects i_rsv */ + struct ext2_rsv_win *i_rsv; /* Reservation window */ }; /* --BXVAT5kNtrzKuDFl-- From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 17:42:06 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C80361065670; Mon, 8 Nov 2010 17:42:06 +0000 (UTC) (envelope-from sarawgi.aditya@gmail.com) Received: from mail-vw0-f54.google.com (mail-vw0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id 6706D8FC1C; Mon, 8 Nov 2010 17:42:06 +0000 (UTC) Received: by vws20 with SMTP id 20so304346vws.13 for ; Mon, 08 Nov 2010 09:42:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:cc:subject :message-id:references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=mVm82mMOju9/lRR1t+H4KvygCcVHzO9IC7P0G4Xoec0=; b=mZl+1r8gul7aUFomCPoCIrBmGirOVQbTSCNUwtQDVYK9xVDch6YhyXCOtODUoNuHfq AD23+OICwH28W+zu14ms8/EFuP7+suDQ8dpxSj6hzMXWA4ArXHfHUaS2Jo0okfvjKh8E dBJ19bsxCKYu8NwTOhc0x3ef8GKf0WU6ayuYc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=QikkURkohMdyJYDueRVpcaH+sCbxKmtABQgy8zSaRRUyMWRuscE7O7gRGQ5bQbxMOy vOvpHJGOJJdObvAOfDkt3ZTwdrVr03fTtu6OKsS719Wcv65cX+Hj9zMHNKuGR0RqwJML oiyc/KkDKkPzMyi6kkwwG3OTpF//7QA9Nm5ZA= Received: by 10.224.199.6 with SMTP id eq6mr4503915qab.272.1289236803254; Mon, 08 Nov 2010 09:20:03 -0800 (PST) Received: from earth ([183.87.49.109]) by mx.google.com with ESMTPS id y21sm79987yhc.14.2010.11.08.09.20.00 (version=TLSv1/SSLv3 cipher=RC4-MD5); Mon, 08 Nov 2010 09:20:02 -0800 (PST) Date: Mon, 8 Nov 2010 22:51:39 +0530 From: Aditya Sarawgi To: Gleb Kurtsou Message-ID: <20101108172136.GA2066@earth> References: 
<20101108143130.GA2799@tops> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20101108143130.GA2799@tops> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs@freebsd.org, Ivan Voras Subject: Re: The state of Giant lock in the file systems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 17:42:07 -0000 On Mon, Nov 08, 2010 at 04:31:30PM +0200, Gleb Kurtsou wrote: > On (08/11/2010 13:28), Ivan Voras wrote: > > I was looking at fusefs sources and there is a dance it does with the > > Giant lock which looks fishy. > It's intended to be fishy. No kernel level locks should be held before > returning to userland, in other words on each syscall vnode is locked (+ > Giant lock for fs if needed), then it's unlocked by filesystem and > relocked upon callback from userspace. puffs is MPSAFE if that could be > of any help for you. > > > Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of > > mentions, but if I understand it correctly only these "active" instances: > > > > 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure > > out what they're guarding > > 2) some manual locking and unlocking in nfsclient which appears to only > > guard printf() (???) > Somewhat unrelated, but does the NFS client unlock vnodes while > sending/waiting for RPC reply? I thought it does, but I'm not sure. > > > 3) some more locking in nfsserver which apparently is only there to > > guard the underlying local file system > > 4) coda, which appears to be the only one marked with D_NEEDGIANT, but > > doesn't do much of its own interfacing with it > > > > Except for these, is there any more magic that would need to be resolved > > to excise Giant from VFS? > Kostik was working on it. 
> > > Would it be correct to think that coda is the single biggest obstacle? > Filesystem should be marked as MPSAFE, it's not D_NEEDGIANT flag but > MNTK_MPSAFE. A lot of filesystems are still locked by Gaint, i.e ext2fs, > smbfs, nwfs, ntfs, etc. > ext2fs on 9-CURRENT is MPSAFE. > > > > _______________________________________________ > > freebsd-fs@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 17:49:07 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 803A71065675 for ; Mon, 8 Nov 2010 17:49:07 +0000 (UTC) (envelope-from olli@lurza.secnetix.de) Received: from lurza.secnetix.de (lurza.secnetix.de [IPv6:2a01:170:102f::2]) by mx1.freebsd.org (Postfix) with ESMTP id EF7368FC1D for ; Mon, 8 Nov 2010 17:49:06 +0000 (UTC) Received: from lurza.secnetix.de (localhost [127.0.0.1]) by lurza.secnetix.de (8.14.3/8.14.3) with ESMTP id oA8Hmnsq085404; Mon, 8 Nov 2010 18:49:04 +0100 (CET) (envelope-from oliver.fromme@secnetix.de) Received: (from olli@localhost) by lurza.secnetix.de (8.14.3/8.14.3/Submit) id oA8HmnLS085403; Mon, 8 Nov 2010 18:48:49 +0100 (CET) (envelope-from olli) Date: Mon, 8 Nov 2010 18:48:49 +0100 (CET) Message-Id: <201011081748.oA8HmnLS085403@lurza.secnetix.de> From: Oliver Fromme To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com In-Reply-To: X-Newsgroups: list.freebsd-fs User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX) (FreeBSD/6.4-PRERELEASE-20080904 (i386)) MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Greylist: Sender IP 
whitelisted, not delayed by milter-greylist-4.3.5 (lurza.secnetix.de [127.0.0.1]); Mon, 08 Nov 2010 18:49:05 +0100 (CET) Cc: Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done() ?error=22] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 17:49:07 -0000 Monthadar Al Jaberi wrote: > I dont know if I am asking on the wrong place. But it has todo with > filesystem and onboard flash (16MB) on a RouterStation Pro board. > > I am running a FreeBSD Current 201010, with the kernel configuration > file specified in /usr/src/sys/mips/conf/AR71XX with device > geom_redboot. > > but I get this error when I try to mount from flash: > mount /dev/redboot/fs /var/fs > mount: /dev/redboot/fs Invalid sectorsize 65536 for superblock size > 8192: Invalid argument > > So I guessed it has todo with the flash configured in 64k sectors > according to the boot output. > ... > mx25l0: at cs 0 on spibus0 > mx25l0: mx25ll128, sector 65536 bytes, 256 sectors > ... Historically UFS/FFS supports only 512 bytes per sector. I think it was patched at some point in the past to support 2048 bytes per sector, too, which is used by some MOD media and DVD-RAM. I'm pretty sure it does _not_ support 65536 bytes per sector (someone please correct me if I'm wrong). > So I just tried to change SBLOCKSIZE from 8129 to 65536 in > /usr/src/sys/ufs/ffs/fs.h, but then I got this error: That won't work. The media sector size is a hard limit; the driver will refuse to read or write anything that is not aligned to the media sector size. Changing the size of the super block (SBLOCKSIZE) won't help much. 
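To make the alignment rule above concrete, here is a minimal Python sketch of the kind of check a block layer applies before passing a request to the media. It is only an illustration, not the actual GEOM or flash driver code: the function name check_aligned is made up for this example, and the numbers are the 65536-byte sector size from the boot output and the READ request reported by g_vfs_done() in this thread.

```python
import errno


def check_aligned(offset, length, sector_size):
    """Return 0 if the request is sector-aligned, else errno.EINVAL.

    Hypothetical helper: both the starting offset and the length must be
    whole multiples of the media sector size.
    """
    if sector_size <= 0:
        raise ValueError("sector_size must be positive")
    if offset % sector_size != 0 or length % sector_size != 0:
        return errno.EINVAL
    return 0


SECTOR_SIZE = 65536  # "sector 65536 bytes" from the mx25l0 boot message

# The failing request: offset 8192 is not a multiple of 65536.
print(check_aligned(8192, 65536, SECTOR_SIZE))   # EINVAL (22 on FreeBSD)

# An 8192-byte superblock read fails as well, because the length is too short.
print(check_aligned(65536, 8192, SECTOR_SIZE))   # EINVAL

# Only whole sectors at sector boundaries pass.
print(check_aligned(65536, 65536, SECTOR_SIZE))  # 0
```

Under these assumptions, enlarging SBLOCKSIZE only changes the length of the request; the offset of 8192 still fails the check, which matches the "error = 22" seen after the change.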
> mount /dev/redboot/fs /mnt/fs
> g_vfs_done():redboot/fs[READ(offset=8192, length=65536)]error = 22
> mount: /dev/redboot/fs : Invalid argument

The UFS code tries to read the super block at offset 8192, which is
not aligned correctly (it's not a multiple of the sector size).

I think UFS is not the right file system to put on flash media
that has 256 sectors of 65536 bytes.

In theory you could insert a translation layer that converts
512-byte access to 65536-byte access (requiring a read-modify-write
operation when writing). Maybe gnop(8) can do this, it has a sector
size option, but I haven't tried it. Anyway, that would be extremely
inefficient.

Another possibility is to create a memory device with mdconfig(8)
and use "dd bs=65536" to copy the contents of the flash device to
the memory disk, then you can mount the memory disk. When you need
to write any modifications back to the flash device, you have to
umount the memory disk and use dd again (with if= and of= reversed,
of course). Make sure that the size of the memory disk is a multiple
of 65536, too.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht
München, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd

"Documentation is like sex; when it's good, it's very, very good,
and when it's bad, it's better than nothing."
-- Dick Brandon From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 17:53:49 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4837A10656B0 for ; Mon, 8 Nov 2010 17:53:49 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from out-0.mx.aerioconnect.net (out-0-37.mx.aerioconnect.net [216.240.47.97]) by mx1.freebsd.org (Postfix) with ESMTP id 274668FC19 for ; Mon, 8 Nov 2010 17:53:48 +0000 (UTC) Received: from idiom.com (postfix@mx0.idiom.com [216.240.32.160]) by out-0.mx.aerioconnect.net (8.13.8/8.13.8) with ESMTP id oA8HrBZl019638; Mon, 8 Nov 2010 09:53:31 -0800 X-Client-Authorized: MaGic Cook1e X-Client-Authorized: MaGic Cook1e Received: from julian-mac.elischer.org (h-67-100-89-137.snfccasy.static.covad.net [67.100.89.137]) by idiom.com (Postfix) with ESMTP id 66F7E2D610B; Fri, 5 Nov 2010 22:15:58 -0700 (PDT) Message-ID: <4CD4E492.3090002@freebsd.org> Date: Fri, 05 Nov 2010 22:16:02 -0700 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.4; en-US; rv:1.9.2.12) Gecko/20101027 Thunderbird/3.1.6 MIME-Version: 1.0 To: "Mikhail T." References: <4CD04AEC.8040607@aldan.algebra.com> <4CD051A9.7090200@freebsd.org> <4CD0660E.2000102@aldan.algebra.com> <4CD06C4B.80100@freebsd.org> <4CD0895A.5030402@aldan.algebra.com> <4CD09830.3030400@freebsd.org> <4CD48F81.1080201@aldan.algebra.com> In-Reply-To: <4CD48F81.1080201@aldan.algebra.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on 216.240.47.51 Cc: fs@freebsd.org Subject: Re: iozone-ing an SSD (Re: Using an SSD "disk" for /) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 17:53:49 -0000 On 11/5/10 4:13 PM, Mikhail T. wrote: > Hello! 
>
> So, after an earlier inquiry, I went ahead and purchased an SSD
> (Crucial's CTFDDAC128MAG-1G1) and put it to some testing today.
>
> The computer is Dell Poweredge 2900, running FreeBSD-8.1/amd64 (the
> October 10th snapshot). Generic kernel. The system drive (for now) is
> traditional "real" HD -- a 15K RPM by Fujitsu (MAX3073RC), I ran `iozone
> -a' 4 times:
>
>    1. On /var/tmp -- freshly newfs-ed by the sysinstall on the Fujitsu
>       drive (/dev/da0).
>    2. On the SSD (/dev/ad4) freshly newfs-ed by me without ANY options
>       (no softupdates).
>    3. On the SSD (/dev/ad4) freshly newfs-ed by me with very large -e
>       and -a options. Reading the man-page, I figured, any parameters
>       mentioning "cylinders" can be set to very large values...
>    4. On the SSD (/dev/da1) connected to the server's mpt-controller,
>       rather than the plain SATA port -- using the same filesystem created in
>       3. above (no reformatting). (The 2.5" can't be secured in the 3.5" slot
>       and is simply hanging in the air on the SATA/SAS connectors.)
>
> The results can be found in 4 HTML files found at:
> http://aldan.algebra.com/~mi/io/ (The original iozone-created Excel
> files are there too.)
>
> They puzzle... Fujitsu, for example, is not an OBVIOUS loser -- it beats
> the SSD in a number of file-size record-length combinations. I also
> can't explain, the differences between different takes on the SSD.
>
> And, lastly, there is a surprising (to me) spike in "Record Rewrite"
> throughput -- for both SSD and HD -- for large files when the reclen is
> 64. Using reclen of 128 results in much worsening throughput --
> especially for the Fujitsu.
>
> I wonder, if these data can be exploited to come up with better newfs
> parameters for the modern disks (SSD and not)... Comments? Thanks!

I have no idea about that brand of SSD, but the industry benchmark
for disk-replacement SSDs at the moment is set by the newest Intel
drives.
> -mi > From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 18:00:35 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 144271065673; Mon, 8 Nov 2010 18:00:35 +0000 (UTC) (envelope-from gleb.kurtsou@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 6797D8FC18; Mon, 8 Nov 2010 18:00:34 +0000 (UTC) Received: by bwz3 with SMTP id 3so5134163bwz.13 for ; Mon, 08 Nov 2010 10:00:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:cc:subject :message-id:references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=umDWEwINQ2QVEKnfcknVxs77tOY/6h8O7vg+W2V2o+E=; b=wwcR35J9uSsP0o3UJhgBZRFKTLJepY8bKGhpZ9m9yyq5fbhMaT+t4KqbMaraDB0QS3 O8ek0BlF9OhEgzlEAi6vJySDAemQOJ+bsge0ThmW2WXVbaN84pO/q21SQpILLnAgw9pS xfOSFhz4limUZzrSK5YeuVpeOBHmLMT4fWoS8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=XTNB99wqrByXu2Kh8E98jTiKx2M5iMg5XCNLXL20UFG1WnFKJOimOO9QnB6Vd7ANVm wydXOJJYq0VfSrzFVfn1fwLevTolImus0p75wcFWtRguxYqPREtynFg/2I1qIPkJDng8 JEdt8A+GvpC+aqVuDpWZI6qiGytBn3ctOdios= Received: by 10.204.120.136 with SMTP id d8mr5093014bkr.152.1289239233254; Mon, 08 Nov 2010 10:00:33 -0800 (PST) Received: from localhost ([91.187.5.20]) by mx.google.com with ESMTPS id v25sm148936bkt.18.2010.11.08.10.00.31 (version=TLSv1/SSLv3 cipher=RC4-MD5); Mon, 08 Nov 2010 10:00:32 -0800 (PST) Date: Mon, 8 Nov 2010 20:00:28 +0200 From: Gleb Kurtsou To: Aditya Sarawgi Message-ID: <20101108180028.GA3964@tops> References: <20101108143130.GA2799@tops> <20101108172136.GA2066@earth> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: 
<20101108172136.GA2066@earth> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org, Ivan Voras Subject: Re: The state of Giant lock in the file systems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 18:00:35 -0000 On (08/11/2010 22:51), Aditya Sarawgi wrote: > On Mon, Nov 08, 2010 at 04:31:30PM +0200, Gleb Kurtsou wrote: > > On (08/11/2010 13:28), Ivan Voras wrote: > > > I was looking at fusefs sources and there is a dance it does with the > > > Giant lock which looks fishy. > > It's intended to be fishy. No kernel level locks should be held before > > returning to userland, in other words on each syscall vnode is locked (+ > > Gaint lock for fs if needed), than it's unlocked by filesystem and > > relocked upon callback from userspace. puffs is MPSAFE if that could be > > of any help for you. > > > > > Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of > > > mentionings, but if I understand it correctly only these "active" instances: > > > > > > 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure > > > out what they're guarding > > > 2) some manual locking and unlocking in nfsclient which appears to only > > > guard printf() (???) > > Somewhat unrelated, but. Does NFS client unlock vnodes while > > sending/waiting for RCP reply? I thought it does, but I'm not sure. > > > > > 3) some more locking in nfsserver which apparently is only there to > > > guard the underlying local file system > > > 4) coda, which appears to be the only one marked with D_NEEDGIANT, but > > > doesn't do much of its own interfacing with it > > > > > > Except for these, is there any more magic that would need to be resolved > > > to excise Giant from VFS? > > Kostik was working on it. > > > > > Would it be correct to think that coda is the single biggest obstacle? 
> > Filesystem should be marked as MPSAFE, it's not D_NEEDGIANT flag but
> > MNTK_MPSAFE. A lot of filesystems are still locked by Giant, e.g. ext2fs,
> > smbfs, nwfs, ntfs, etc.
> >
>
> ext2fs on 9-CURRENT is MPSAFE.

Didn't check it for a while, sorry.

But there's a deadlock in ext2_rename: it doesn't follow the vnode
locking order (parent -> child) when it does vn_lock(fvp). The problem
can't be fixed in a generic way at the moment; the best solution would
probably be to follow UFS and unlock all vnodes, lock them one by one
and relookup. The same applies to tmpfs.

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 18:16:14 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 62D8B1065693; Mon, 8 Nov 2010 18:16:14 +0000 (UTC) (envelope-from sarawgi.aditya@gmail.com) Received: from mail-gx0-f182.google.com (mail-gx0-f182.google.com [209.85.161.182]) by mx1.freebsd.org (Postfix) with ESMTP id 057F58FC12; Mon, 8 Nov 2010 18:16:13 +0000 (UTC) Received: by gxk9 with SMTP id 9so3767691gxk.13 for ; Mon, 08 Nov 2010 10:16:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:cc:subject :message-id:references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=ZvfRvxD+AWPjqTosS2n/4a1mmbAGx28wQ3OTCTC3S1s=; b=gqGWVnl/HTXj/fV0zAOnXzy3Ze9cuGUcgqXLcld5H0JsqN6Mwzwy8Jx3S1oguJ/yth RING6XwWByB4n4jXZdTp3vS7ub25/GIxacBGApLWB+4C0ABo4gBSiO82KKHAMxWZ9qwb l+LRoKMGRymYL6EQSk8poziNg4pM38RxWqFQ4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=kbKh8QwRk7RQkI/zkr/7M0BH59f6PqZzGOCSSQsXQQAVvI/uwB2AMY+1fyh/0FX3oj 0wwBh13TsxMP/geP+3C0ufP6joj2abTuU8gN2Ogd3nVnZQDMSpKyoHfKVbZwMylqkAZa rIRp3sX2OfQnEFEoVBS5VtEv1Cc4ndtyrtO74= Received: by 10.150.146.17 with
SMTP id t17mr939166ybd.337.1289240173058; Mon, 08 Nov 2010 10:16:13 -0800 (PST) Received: from earth ([183.87.49.109]) by mx.google.com with ESMTPS id v8sm113910yba.14.2010.11.08.10.16.10 (version=TLSv1/SSLv3 cipher=RC4-MD5); Mon, 08 Nov 2010 10:16:12 -0800 (PST) Date: Mon, 8 Nov 2010 23:47:49 +0530 From: Aditya Sarawgi To: Gleb Kurtsou Message-ID: <20101108181748.GD2066@earth> References: <20101108143130.GA2799@tops> <20101108172136.GA2066@earth> <20101108180028.GA3964@tops> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20101108180028.GA3964@tops> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs@freebsd.org, Ivan Voras Subject: Re: The state of Giant lock in the file systems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 18:16:14 -0000 On Mon, Nov 08, 2010 at 08:00:28PM +0200, Gleb Kurtsou wrote: > On (08/11/2010 22:51), Aditya Sarawgi wrote: > > On Mon, Nov 08, 2010 at 04:31:30PM +0200, Gleb Kurtsou wrote: > > > On (08/11/2010 13:28), Ivan Voras wrote: > > > > I was looking at fusefs sources and there is a dance it does with the > > > > Giant lock which looks fishy. > > > It's intended to be fishy. No kernel level locks should be held before > > > returning to userland, in other words on each syscall vnode is locked (+ > > > Gaint lock for fs if needed), than it's unlocked by filesystem and > > > relocked upon callback from userspace. puffs is MPSAFE if that could be > > > of any help for you. 
> > > > > > > Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of > > > > mentionings, but if I understand it correctly only these "active" instances: > > > > > > > > 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure > > > > out what they're guarding > > > > 2) some manual locking and unlocking in nfsclient which appears to only > > > > guard printf() (???) > > > Somewhat unrelated, but. Does NFS client unlock vnodes while > > > sending/waiting for RCP reply? I thought it does, but I'm not sure. > > > > > > > 3) some more locking in nfsserver which apparently is only there to > > > > guard the underlying local file system > > > > 4) coda, which appears to be the only one marked with D_NEEDGIANT, but > > > > doesn't do much of its own interfacing with it > > > > > > > > Except for these, is there any more magic that would need to be resolved > > > > to excise Giant from VFS? > > > Kostik was working on it. > > > > > > > Would it be correct to think that coda is the single biggest obstacle? > > > Filesystem should be marked as MPSAFE, it's not D_NEEDGIANT flag but > > > MNTK_MPSAFE. A lot of filesystems are still locked by Gaint, i.e ext2fs, > > > smbfs, nwfs, ntfs, etc. > > > > > > > ext2fs on 9-CURRENT is MPSAFE. > Didn't check it for a while, sorry. > No Problem :) > But there's a deadlock in ext2_rename, it doesn't following vnode > locking order (parent -> child) by doing vn_lock(fvp). The problem can't > be fixed in a generic way at the moment, the best solution would > probably be to follow UFS and unlock all vnodes, lock one-by-one and > relookup. The same applies to tmpfs. > Thanks for pointing this out. Saw some mails related to this earlier. Will take a look. 
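The parent -> child locking order discussed in this thread can be illustrated with a small Python sketch. Please read it with care: it is not kernel code and it is not the UFS approach mentioned above (unlock all vnodes, lock them one by one and relookup); it only shows why a single agreed order avoids the deadlock. Node, lock_parent_first, unlock_all and rename are hypothetical names, and the depth field is an assumed stand-in for a vnode's position in the directory tree.

```python
import threading


class Node:
    """Hypothetical stand-in for a vnode: a name, a tree depth, and a lock."""

    def __init__(self, name, depth):
        self.name = name
        self.depth = depth
        self.lock = threading.Lock()


def lock_parent_first(*nodes):
    """Lock every distinct node, parents (smaller depth) before children.

    Ties are broken by name and then by id() so that all threads agree
    on one global order, which is what rules out lock-order deadlocks.
    """
    ordered = sorted(set(nodes), key=lambda n: (n.depth, n.name, id(n)))
    for node in ordered:
        node.lock.acquire()
    return ordered


def unlock_all(held):
    """Release the locks in the reverse of the order they were taken."""
    for node in reversed(held):
        node.lock.release()


def rename(src_dir, src, dst_dir):
    """Pretend rename: only the locking pattern is of interest here."""
    held = lock_parent_first(src_dir, src, dst_dir)
    try:
        return [node.name for node in held]
    finally:
        unlock_all(held)


if __name__ == "__main__":
    a, b = Node("a", 1), Node("b", 1)
    x, y = Node("x", 2), Node("y", 2)
    # Two threads rename in opposite directions; both still lock a before b.
    t1 = threading.Thread(target=lambda: [rename(a, x, b) for _ in range(1000)])
    t2 = threading.Thread(target=lambda: [rename(b, y, a) for _ in range(1000)])
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print(rename(b, y, a))  # ['a', 'b', 'y']
```

If each thread instead locked its source directory first and its destination directory second, the two renames above could each hold one directory and wait forever for the other, which is the shape of the ext2_rename problem described by Gleb.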
From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 19:01:33 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E87C3106564A for ; Mon, 8 Nov 2010 19:01:33 +0000 (UTC) (envelope-from carlson39@llnl.gov) Received: from nspiron-1.llnl.gov (nspiron-1.llnl.gov [128.115.41.81]) by mx1.freebsd.org (Postfix) with ESMTP id D32CA8FC13 for ; Mon, 8 Nov 2010 19:01:33 +0000 (UTC) X-Attachments: None Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135]) by nspiron-1.llnl.gov with ESMTP; 08 Nov 2010 10:32:55 -0800 Message-ID: <4CD84258.6090404@llnl.gov> Date: Mon, 08 Nov 2010 10:32:56 -0800 From: Mike Carlson User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: freebsd-fs@freebsd.org, pjd@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Subject: 8.1-RELEASE: ZFS data errors X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 19:01:34 -0000 I'm having a problem with striping seven 18TB RAID6 (hardware SAN) volumes together.
Here is a quick rundown of the hardware: * HP DL180 G6 w/12GB ram * QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter) * Winchester Hardware SAN, da2 at isp0 bus 0 scbus2 target 0 lun 0 da2: Fixed Direct Access SCSI-5 device da2: 800.000MB/s transfers da2: Command Queueing enabled da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C) As soon as I create the volume and write data to it, it is reported as being corrupted: write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8 write# zpool scrub filevol001dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000 write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000 1000+0 records in 1000+0 records out 1048576000 bytes transferred in 16.472807 secs (63654968 bytes/sec) write# cd /filevol001/ write# ls random.dat.1 write# md5 * MD5 (random.dat.1) = 629f8883d6394189a1658d24a5698bb3 write# cp random.dat.1 random.dat.2 cp: random.dat.1: Input/output error write# zpool status pool: filevol001 state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM filevol001 ONLINE 0 0 0 da2 ONLINE 0 0 0 da3 ONLINE 0 0 0 da4 ONLINE 0 0 0 da5 ONLINE 0 0 0 da6 ONLINE 0 0 0 da7 ONLINE 0 0 0 da8 ONLINE 0 0 0 errors: No known data errors write# zpool scrub filevol001 write# zpool status pool: filevol001 state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. 
see: http://www.sun.com/msg/ZFS-8000-8A scrub: scrub completed after 0h0m with 2437 errors on Mon Nov 8 10:14:20 2010 config: NAME STATE READ WRITE CKSUM filevol001 ONLINE 0 0 2.38K da2 ONLINE 0 0 1.24K 12K repaired da3 ONLINE 0 0 1.12K da4 ONLINE 0 0 1.13K da5 ONLINE 0 0 1.27K da6 ONLINE 0 0 0 da7 ONLINE 0 0 0 da8 ONLINE 0 0 0 errors: 2437 data errors, use '-v' for a list However, if I create a 'raidz' volume, no errors occur: write# zpool destroy filevol001 write# zpool create filevol001 raidz da2 da3 da4 da5 da6 da7 da8 write# zpool status pool: filevol001 state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM filevol001 ONLINE 0 0 0 raidz1 ONLINE 0 0 0 da2 ONLINE 0 0 0 da3 ONLINE 0 0 0 da4 ONLINE 0 0 0 da5 ONLINE 0 0 0 da6 ONLINE 0 0 0 da7 ONLINE 0 0 0 da8 ONLINE 0 0 0 errors: No known data errors write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000 1000+0 records in 1000+0 records out 1048576000 bytes transferred in 17.135045 secs (61194821 bytes/sec) write# zpool scrub filevol001 dmesg output: write# zpool status pool: filevol001 state: ONLINE scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 09:54:51 2010 config: NAME STATE READ WRITE CKSUM filevol001 ONLINE 0 0 0 raidz1 ONLINE 0 0 0 da2 ONLINE 0 0 0 da3 ONLINE 0 0 0 da4 ONLINE 0 0 0 da5 ONLINE 0 0 0 da6 ONLINE 0 0 0 da7 ONLINE 0 0 0 da8 ONLINE 0 0 0 errors: No known data errors write# ls random.dat.1 write# cp random.dat.1 random.dat.2 write# cp random.dat.1 random.dat.3 write# cp random.dat.1 random.dat.4 write# cp random.dat.1 random.dat.5 write# cp random.dat.1 random.dat.6 write# cp random.dat.1 random.dat.7 write# md5 * MD5 (random.dat.1) = f5e3467f61a954bc2e0bcc35d49ac8b2 MD5 (random.dat.2) = f5e3467f61a954bc2e0bcc35d49ac8b2 MD5 (random.dat.3) = f5e3467f61a954bc2e0bcc35d49ac8b2 MD5 (random.dat.4) = f5e3467f61a954bc2e0bcc35d49ac8b2 MD5 (random.dat.5) = f5e3467f61a954bc2e0bcc35d49ac8b2 MD5 (random.dat.6) = f5e3467f61a954bc2e0bcc35d49ac8b2 MD5 (random.dat.7) = 
f5e3467f61a954bc2e0bcc35d49ac8b2 What is also odd, is if I create 7 separate ZFS volumes, they do not report any data corruption: write# zpool destroy filevol001 write# zpool create test01 da2 write# zpool create test02 da3 write# zpool create test03 da4 write# zpool create test04 da5 write# zpool create test05 da6 write# zpool create test06 da7 write# zpool create test07 da8 write# zpool status pool: test01 state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM test01 ONLINE 0 0 0 da2 ONLINE 0 0 0 errors: No known data errors pool: test02 state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM test02 ONLINE 0 0 0 da3 ONLINE 0 0 0 errors: No known data errors pool: test03 state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM test03 ONLINE 0 0 0 da4 ONLINE 0 0 0 errors: No known data errors pool: test04 state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM test04 ONLINE 0 0 0 da5 ONLINE 0 0 0 errors: No known data errors pool: test05 state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM test05 ONLINE 0 0 0 da6 ONLINE 0 0 0 errors: No known data errors pool: test06 state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM test06 ONLINE 0 0 0 da7 ONLINE 0 0 0 errors: No known data errors pool: test07 state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM test07 ONLINE 0 0 0 da8 ONLINE 0 0 0 errors: No known data errors write# dd if=/dev/random of=/tmp/random.dat.1 bs=1m count=1000 1000+0 records in 1000+0 records out 1048576000 bytes transferred in 19.286735 secs (54367730 bytes/sec) write# cd /tmp/ write# md5 /tmp/random.dat.1 MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 write# cp random.dat.1 /test01 ; cp random.dat.1 /test02 ;cp random.dat.1 /test03 ; cp random.dat.1 /test04 ; cp random.dat.1 /test05 ; cp random.dat.1 /test06 ; cp random.dat.1 /test07 write# md5 /test*/* MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 MD5 
(/test02/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 MD5 (/test03/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 MD5 (/test04/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 MD5 (/test05/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 MD5 (/test06/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 MD5 (/test07/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 write# zpool scrub test01 ; zpool scrub test02 ;zpool scrub test03 ;zpool scrub test04 ; zpool scrub test05 ; zpool scrub test06 ; zpool scrub test07 write# zpool status pool: test01 state: ONLINE scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 10:27:49 2010 config: NAME STATE READ WRITE CKSUM test01 ONLINE 0 0 0 da2 ONLINE 0 0 0 errors: No known data errors pool: test02 state: ONLINE scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 10:27:52 2010 config: NAME STATE READ WRITE CKSUM test02 ONLINE 0 0 0 da3 ONLINE 0 0 0 errors: No known data errors pool: test03 state: ONLINE scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 10:27:54 2010 config: NAME STATE READ WRITE CKSUM test03 ONLINE 0 0 0 da4 ONLINE 0 0 0 errors: No known data errors pool: test04 state: ONLINE scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 10:27:57 2010 config: NAME STATE READ WRITE CKSUM test04 ONLINE 0 0 0 da5 ONLINE 0 0 0 errors: No known data errors pool: test05 state: ONLINE scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 10:28:00 2010 config: NAME STATE READ WRITE CKSUM test05 ONLINE 0 0 0 da6 ONLINE 0 0 0 errors: No known data errors pool: test06 state: ONLINE scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 10:28:02 2010 config: NAME STATE READ WRITE CKSUM test06 ONLINE 0 0 0 da7 ONLINE 0 0 0 errors: No known data errors pool: test07 state: ONLINE scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 10:28:05 2010 config: NAME STATE READ WRITE CKSUM test07 ONLINE 0 0 0 da8 ONLINE 0 0 0 errors: No known data errors Based on these results, I've 
drawn the following conclusion:

    * ZFS single pool per device = OKAY
    * ZFS raidz of all devices = OKAY
    * ZFS stripe of all devices = NOT OKAY

The results are immediate, and I know ZFS will self-heal, so is that
what it is doing behind my back and just not reporting it? Is this a
ZFS bug with striping vs. raidz?

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 19:06:43 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A19011065673 for ; Mon, 8 Nov 2010 19:06:43 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta01.westchester.pa.mail.comcast.net (qmta01.westchester.pa.mail.comcast.net [76.96.62.16]) by mx1.freebsd.org (Postfix) with ESMTP id 4C3DF8FC20 for ; Mon, 8 Nov 2010 19:06:42 +0000 (UTC) Received: from omta19.westchester.pa.mail.comcast.net ([76.96.62.98]) by qmta01.westchester.pa.mail.comcast.net with comcast id UbpN1f00127AodY51j6jJx; Mon, 08 Nov 2010 19:06:43 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta19.westchester.pa.mail.comcast.net with comcast id Uj6i1f0043LrwQ23fj6ius; Mon, 08 Nov 2010 19:06:43 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id DA0469B427; Mon, 8 Nov 2010 11:06:40 -0800 (PST) Date: Mon, 8 Nov 2010 11:06:40 -0800 From: Jeremy Chadwick To: Mike Carlson Message-ID: <20101108190640.GA15661@icarus.home.lan> References: <4CD84258.6090404@llnl.gov> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CD84258.6090404@llnl.gov> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org, pjd@freebsd.org Subject: Re: 8.1-RELEASE: ZFS data errors X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 19:06:43 -0000 On Mon, Nov 08, 2010 at 10:32:56AM -0800, Mike Carlson
wrote: > I'm having a problem with stripping 7 18TB RAID6 (hardware SAN) > volumes together. > > Here is a quick rundown of the hardware: > * HP DL180 G6 w/12GB ram > * QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter) > * Winchester Hardware SAN, > > da2 at isp0 bus 0 scbus2 target 0 lun 0 > da2: Fixed Direct Access SCSI-5 device > da2: 800.000MB/s transfers > da2: Command Queueing enabled > da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C) > > > As soon as I create the volume and write data to it, it is reported > as being corrupted: > > write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8 > write# zpool scrub filevol001dd if=/dev/random > of=/filevol001/random.dat.1 bs=1m count=1000 > write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000 > 1000+0 records in > 1000+0 records out > 1048576000 bytes transferred in 16.472807 secs (63654968 bytes/sec) > write# cd /filevol001/ > write# ls > random.dat.1 > write# md5 * > MD5 (random.dat.1) = 629f8883d6394189a1658d24a5698bb3 > write# cp random.dat.1 random.dat.2 > cp: random.dat.1: Input/output error > write# zpool status > pool: filevol001 > state: ONLINE > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > filevol001 ONLINE 0 0 0 > da2 ONLINE 0 0 0 > da3 ONLINE 0 0 0 > da4 ONLINE 0 0 0 > da5 ONLINE 0 0 0 > da6 ONLINE 0 0 0 > da7 ONLINE 0 0 0 > da8 ONLINE 0 0 0 > > errors: No known data errors > write# zpool scrub filevol001 > write# zpool status > pool: filevol001 > state: ONLINE > status: One or more devices has experienced an error resulting in data > corruption. Applications may be affected. > action: Restore the file in question if possible. Otherwise restore the > entire pool from backup. 
> see: http://www.sun.com/msg/ZFS-8000-8A > scrub: scrub completed after 0h0m with 2437 errors on Mon Nov 8 > 10:14:20 2010 > config: > > NAME STATE READ WRITE CKSUM > filevol001 ONLINE 0 0 2.38K > da2 ONLINE 0 0 1.24K 12K repaired > da3 ONLINE 0 0 1.12K > da4 ONLINE 0 0 1.13K > da5 ONLINE 0 0 1.27K > da6 ONLINE 0 0 0 > da7 ONLINE 0 0 0 > da8 ONLINE 0 0 0 > > errors: 2437 data errors, use '-v' for a list > > However, if I create a 'raidz' volume, no errors occur: > > write# zpool destroy filevol001 > write# zpool create filevol001 raidz da2 da3 da4 da5 da6 da7 da8 > write# zpool status > pool: filevol001 > state: ONLINE > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > filevol001 ONLINE 0 0 0 > raidz1 ONLINE 0 0 0 > da2 ONLINE 0 0 0 > da3 ONLINE 0 0 0 > da4 ONLINE 0 0 0 > da5 ONLINE 0 0 0 > da6 ONLINE 0 0 0 > da7 ONLINE 0 0 0 > da8 ONLINE 0 0 0 > > errors: No known data errors > write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000 > 1000+0 records in > 1000+0 records out > 1048576000 bytes transferred in 17.135045 secs (61194821 bytes/sec) > write# zpool scrub filevol001 > > dmesg output: > write# zpool status > pool: filevol001 > state: ONLINE > scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > 09:54:51 2010 > config: > > NAME STATE READ WRITE CKSUM > filevol001 ONLINE 0 0 0 > raidz1 ONLINE 0 0 0 > da2 ONLINE 0 0 0 > da3 ONLINE 0 0 0 > da4 ONLINE 0 0 0 > da5 ONLINE 0 0 0 > da6 ONLINE 0 0 0 > da7 ONLINE 0 0 0 > da8 ONLINE 0 0 0 > > errors: No known data errors > write# ls > random.dat.1 > write# cp random.dat.1 random.dat.2 > write# cp random.dat.1 random.dat.3 > write# cp random.dat.1 random.dat.4 > write# cp random.dat.1 random.dat.5 > write# cp random.dat.1 random.dat.6 > write# cp random.dat.1 random.dat.7 > write# md5 * > MD5 (random.dat.1) = f5e3467f61a954bc2e0bcc35d49ac8b2 > MD5 (random.dat.2) = f5e3467f61a954bc2e0bcc35d49ac8b2 > MD5 (random.dat.3) = f5e3467f61a954bc2e0bcc35d49ac8b2 > MD5 (random.dat.4) = 
f5e3467f61a954bc2e0bcc35d49ac8b2 > MD5 (random.dat.5) = f5e3467f61a954bc2e0bcc35d49ac8b2 > MD5 (random.dat.6) = f5e3467f61a954bc2e0bcc35d49ac8b2 > MD5 (random.dat.7) = f5e3467f61a954bc2e0bcc35d49ac8b2 > > What is also odd, is if I create 7 separate ZFS volumes, they do not > report any data corruption: > > write# zpool destroy filevol001 > write# zpool create test01 da2 > write# zpool create test02 da3 > write# zpool create test03 da4 > write# zpool create test04 da5 > write# zpool create test05 da6 > write# zpool create test06 da7 > write# zpool create test07 da8 > write# zpool status > pool: test01 > state: ONLINE > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > test01 ONLINE 0 0 0 > da2 ONLINE 0 0 0 > > errors: No known data errors > > pool: test02 > state: ONLINE > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > test02 ONLINE 0 0 0 > da3 ONLINE 0 0 0 > > errors: No known data errors > > pool: test03 > state: ONLINE > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > test03 ONLINE 0 0 0 > da4 ONLINE 0 0 0 > > errors: No known data errors > > pool: test04 > state: ONLINE > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > test04 ONLINE 0 0 0 > da5 ONLINE 0 0 0 > > errors: No known data errors > > pool: test05 > state: ONLINE > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > test05 ONLINE 0 0 0 > da6 ONLINE 0 0 0 > > errors: No known data errors > > pool: test06 > state: ONLINE > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > test06 ONLINE 0 0 0 > da7 ONLINE 0 0 0 > > errors: No known data errors > > pool: test07 > state: ONLINE > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > test07 ONLINE 0 0 0 > da8 ONLINE 0 0 0 > > errors: No known data errors > write# dd if=/dev/random of=/tmp/random.dat.1 bs=1m count=1000 > 1000+0 records in > 1000+0 records out > 1048576000 bytes transferred in 19.286735 secs (54367730 bytes/sec) > write# cd /tmp/ > 
write# md5 /tmp/random.dat.1 > MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > write# cp random.dat.1 /test01 ; cp random.dat.1 /test02 ;cp > random.dat.1 /test03 ; cp random.dat.1 /test04 ; cp random.dat.1 > /test05 ; cp random.dat.1 /test06 ; cp random.dat.1 /test07 > write# md5 /test*/* > MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > MD5 (/test02/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > MD5 (/test03/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > MD5 (/test04/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > MD5 (/test05/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > MD5 (/test06/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > MD5 (/test07/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > write# zpool scrub test01 ; zpool scrub test02 ;zpool scrub test03 > ;zpool scrub test04 ; zpool scrub test05 ; zpool scrub test06 ; > zpool scrub test07 > write# zpool status > pool: test01 > state: ONLINE > scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > 10:27:49 2010 > config: > > NAME STATE READ WRITE CKSUM > test01 ONLINE 0 0 0 > da2 ONLINE 0 0 0 > > errors: No known data errors > > pool: test02 > state: ONLINE > scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > 10:27:52 2010 > config: > > NAME STATE READ WRITE CKSUM > test02 ONLINE 0 0 0 > da3 ONLINE 0 0 0 > > errors: No known data errors > > pool: test03 > state: ONLINE > scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > 10:27:54 2010 > config: > > NAME STATE READ WRITE CKSUM > test03 ONLINE 0 0 0 > da4 ONLINE 0 0 0 > > errors: No known data errors > > pool: test04 > state: ONLINE > scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > 10:27:57 2010 > config: > > NAME STATE READ WRITE CKSUM > test04 ONLINE 0 0 0 > da5 ONLINE 0 0 0 > > errors: No known data errors > > pool: test05 > state: ONLINE > scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > 10:28:00 2010 > config: > > NAME STATE READ WRITE CKSUM > 
test05      ONLINE       0     0     0
>           da6       ONLINE       0     0     0
>
> errors: No known data errors
>
>   pool: test06
>  state: ONLINE
>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
> 10:28:02 2010
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         test06      ONLINE       0     0     0
>           da7       ONLINE       0     0     0
>
> errors: No known data errors
>
>   pool: test07
>  state: ONLINE
>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
> 10:28:05 2010
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         test07      ONLINE       0     0     0
>           da8       ONLINE       0     0     0
>
> errors: No known data errors
>
> Based on these results, I've drawn the following conclusion:
> * ZFS single pool per device = OKAY
> * ZFS raidz of all devices = OKAY
> * ZFS stripe of all devices = NOT OKAY
>
> The results are immediate, and I know ZFS will self-heal, so is that
> what it is doing behind my back and just not reporting it? Is this a
> ZFS bug with striping vs. raidz?

Can you reproduce this problem using RELENG_8?  Please try one of the
below snapshots.

ftp://ftp4.freebsd.org/pub/FreeBSD/snapshots/201011/

--
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.
PGP: 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 19:11:30 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E9C0A1065670; Mon, 8 Nov 2010 19:11:30 +0000 (UTC) (envelope-from carlson39@llnl.gov) Received: from nspiron-1.llnl.gov (nspiron-1.llnl.gov [128.115.41.81]) by mx1.freebsd.org (Postfix) with ESMTP id C9E918FC1B; Mon, 8 Nov 2010 19:11:30 +0000 (UTC) X-Attachments: None Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135]) by nspiron-1.llnl.gov with ESMTP; 08 Nov 2010 11:11:30 -0800 Message-ID: <4CD84B63.4030800@llnl.gov> Date: Mon, 08 Nov 2010 11:11:31 -0800 From: Mike Carlson User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: Jeremy Chadwick References: <4CD84258.6090404@llnl.gov> <20101108190640.GA15661@icarus.home.lan> In-Reply-To: <20101108190640.GA15661@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-fs@freebsd.org" , "pjd@freebsd.org" Subject: Re: 8.1-RELEASE: ZFS data errors X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 19:11:31 -0000 On 11/08/2010 11:06 AM, Jeremy Chadwick wrote: > On Mon, Nov 08, 2010 at 10:32:56AM -0800, Mike Carlson wrote: >> I'm having a problem with stripping 7 18TB RAID6 (hardware SAN) >> volumes together. 
>> >> Here is a quick rundown of the hardware: >> * HP DL180 G6 w/12GB ram >> * QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter) >> * Winchester Hardware SAN, >> >> da2 at isp0 bus 0 scbus2 target 0 lun 0 >> da2: Fixed Direct Access SCSI-5 device >> da2: 800.000MB/s transfers >> da2: Command Queueing enabled >> da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C) >> >> >> As soon as I create the volume and write data to it, it is reported >> as being corrupted: >> >> write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8 >> write# zpool scrub filevol001dd if=/dev/random >> of=/filevol001/random.dat.1 bs=1m count=1000 >> write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000 >> 1000+0 records in >> 1000+0 records out >> 1048576000 bytes transferred in 16.472807 secs (63654968 bytes/sec) >> write# cd /filevol001/ >> write# ls >> random.dat.1 >> write# md5 * >> MD5 (random.dat.1) = 629f8883d6394189a1658d24a5698bb3 >> write# cp random.dat.1 random.dat.2 >> cp: random.dat.1: Input/output error >> write# zpool status >> pool: filevol001 >> state: ONLINE >> scrub: none requested >> config: >> >> NAME STATE READ WRITE CKSUM >> filevol001 ONLINE 0 0 0 >> da2 ONLINE 0 0 0 >> da3 ONLINE 0 0 0 >> da4 ONLINE 0 0 0 >> da5 ONLINE 0 0 0 >> da6 ONLINE 0 0 0 >> da7 ONLINE 0 0 0 >> da8 ONLINE 0 0 0 >> >> errors: No known data errors >> write# zpool scrub filevol001 >> write# zpool status >> pool: filevol001 >> state: ONLINE >> status: One or more devices has experienced an error resulting in data >> corruption. Applications may be affected. >> action: Restore the file in question if possible. Otherwise restore the >> entire pool from backup. 
>> see: http://www.sun.com/msg/ZFS-8000-8A >> scrub: scrub completed after 0h0m with 2437 errors on Mon Nov 8 >> 10:14:20 2010 >> config: >> >> NAME STATE READ WRITE CKSUM >> filevol001 ONLINE 0 0 2.38K >> da2 ONLINE 0 0 1.24K 12K repaired >> da3 ONLINE 0 0 1.12K >> da4 ONLINE 0 0 1.13K >> da5 ONLINE 0 0 1.27K >> da6 ONLINE 0 0 0 >> da7 ONLINE 0 0 0 >> da8 ONLINE 0 0 0 >> >> errors: 2437 data errors, use '-v' for a list >> >> However, if I create a 'raidz' volume, no errors occur: >> >> write# zpool destroy filevol001 >> write# zpool create filevol001 raidz da2 da3 da4 da5 da6 da7 da8 >> write# zpool status >> pool: filevol001 >> state: ONLINE >> scrub: none requested >> config: >> >> NAME STATE READ WRITE CKSUM >> filevol001 ONLINE 0 0 0 >> raidz1 ONLINE 0 0 0 >> da2 ONLINE 0 0 0 >> da3 ONLINE 0 0 0 >> da4 ONLINE 0 0 0 >> da5 ONLINE 0 0 0 >> da6 ONLINE 0 0 0 >> da7 ONLINE 0 0 0 >> da8 ONLINE 0 0 0 >> >> errors: No known data errors >> write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000 >> 1000+0 records in >> 1000+0 records out >> 1048576000 bytes transferred in 17.135045 secs (61194821 bytes/sec) >> write# zpool scrub filevol001 >> >> dmesg output: >> write# zpool status >> pool: filevol001 >> state: ONLINE >> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 >> 09:54:51 2010 >> config: >> >> NAME STATE READ WRITE CKSUM >> filevol001 ONLINE 0 0 0 >> raidz1 ONLINE 0 0 0 >> da2 ONLINE 0 0 0 >> da3 ONLINE 0 0 0 >> da4 ONLINE 0 0 0 >> da5 ONLINE 0 0 0 >> da6 ONLINE 0 0 0 >> da7 ONLINE 0 0 0 >> da8 ONLINE 0 0 0 >> >> errors: No known data errors >> write# ls >> random.dat.1 >> write# cp random.dat.1 random.dat.2 >> write# cp random.dat.1 random.dat.3 >> write# cp random.dat.1 random.dat.4 >> write# cp random.dat.1 random.dat.5 >> write# cp random.dat.1 random.dat.6 >> write# cp random.dat.1 random.dat.7 >> write# md5 * >> MD5 (random.dat.1) = f5e3467f61a954bc2e0bcc35d49ac8b2 >> MD5 (random.dat.2) =
f5e3467f61a954bc2e0bcc35d49ac8b2 >> MD5 (random.dat.3) = f5e3467f61a954bc2e0bcc35d49ac8b2 >> MD5 (random.dat.4) = f5e3467f61a954bc2e0bcc35d49ac8b2 >> MD5 (random.dat.5) = f5e3467f61a954bc2e0bcc35d49ac8b2 >> MD5 (random.dat.6) = f5e3467f61a954bc2e0bcc35d49ac8b2 >> MD5 (random.dat.7) = f5e3467f61a954bc2e0bcc35d49ac8b2 >> >> What is also odd, is if I create 7 separate ZFS volumes, they do not >> report any data corruption: >> >> write# zpool destroy filevol001 >> write# zpool create test01 da2 >> write# zpool create test02 da3 >> write# zpool create test03 da4 >> write# zpool create test04 da5 >> write# zpool create test05 da6 >> write# zpool create test06 da7 >> write# zpool create test07 da8 >> write# zpool status >> pool: test01 >> state: ONLINE >> scrub: none requested >> config: >> >> NAME STATE READ WRITE CKSUM >> test01 ONLINE 0 0 0 >> da2 ONLINE 0 0 0 >> >> errors: No known data errors >> >> pool: test02 >> state: ONLINE >> scrub: none requested >> config: >> >> NAME STATE READ WRITE CKSUM >> test02 ONLINE 0 0 0 >> da3 ONLINE 0 0 0 >> >> errors: No known data errors >> >> pool: test03 >> state: ONLINE >> scrub: none requested >> config: >> >> NAME STATE READ WRITE CKSUM >> test03 ONLINE 0 0 0 >> da4 ONLINE 0 0 0 >> >> errors: No known data errors >> >> pool: test04 >> state: ONLINE >> scrub: none requested >> config: >> >> NAME STATE READ WRITE CKSUM >> test04 ONLINE 0 0 0 >> da5 ONLINE 0 0 0 >> >> errors: No known data errors >> >> pool: test05 >> state: ONLINE >> scrub: none requested >> config: >> >> NAME STATE READ WRITE CKSUM >> test05 ONLINE 0 0 0 >> da6 ONLINE 0 0 0 >> >> errors: No known data errors >> >> pool: test06 >> state: ONLINE >> scrub: none requested >> config: >> >> NAME STATE READ WRITE CKSUM >> test06 ONLINE 0 0 0 >> da7 ONLINE 0 0 0 >> >> errors: No known data errors >> >> pool: test07 >> state: ONLINE >> scrub: none requested >> config: >> >> NAME STATE READ WRITE CKSUM >> test07 ONLINE 0 0 0 >> da8 ONLINE 0 0 0 >> >> errors: No known 
data errors >> write# dd if=/dev/random of=/tmp/random.dat.1 bs=1m count=1000 >> 1000+0 records in >> 1000+0 records out >> 1048576000 bytes transferred in 19.286735 secs (54367730 bytes/sec) >> write# cd /tmp/ >> write# md5 /tmp/random.dat.1 >> MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >> write# cp random.dat.1 /test01 ; cp random.dat.1 /test02 ;cp >> random.dat.1 /test03 ; cp random.dat.1 /test04 ; cp random.dat.1 >> /test05 ; cp random.dat.1 /test06 ; cp random.dat.1 /test07 >> write# md5 /test*/* >> MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >> MD5 (/test02/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >> MD5 (/test03/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >> MD5 (/test04/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >> MD5 (/test05/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >> MD5 (/test06/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >> MD5 (/test07/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >> write# zpool scrub test01 ; zpool scrub test02 ;zpool scrub test03 >> ;zpool scrub test04 ; zpool scrub test05 ; zpool scrub test06 ; >> zpool scrub test07 >> write# zpool status >> pool: test01 >> state: ONLINE >> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 >> 10:27:49 2010 >> config: >> >> NAME STATE READ WRITE CKSUM >> test01 ONLINE 0 0 0 >> da2 ONLINE 0 0 0 >> >> errors: No known data errors >> >> pool: test02 >> state: ONLINE >> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 >> 10:27:52 2010 >> config: >> >> NAME STATE READ WRITE CKSUM >> test02 ONLINE 0 0 0 >> da3 ONLINE 0 0 0 >> >> errors: No known data errors >> >> pool: test03 >> state: ONLINE >> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 >> 10:27:54 2010 >> config: >> >> NAME STATE READ WRITE CKSUM >> test03 ONLINE 0 0 0 >> da4 ONLINE 0 0 0 >> >> errors: No known data errors >> >> pool: test04 >> state: ONLINE >> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 >> 10:27:57 2010 >> 
config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test04      ONLINE       0     0     0
>>           da5       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test05
>>  state: ONLINE
>>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
>> 10:28:00 2010
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test05      ONLINE       0     0     0
>>           da6       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test06
>>  state: ONLINE
>>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
>> 10:28:02 2010
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test06      ONLINE       0     0     0
>>           da7       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test07
>>  state: ONLINE
>>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
>> 10:28:05 2010
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test07      ONLINE       0     0     0
>>           da8       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> Based on these results, I've drawn the following conclusion:
>> * ZFS single pool per device = OKAY
>> * ZFS raidz of all devices = OKAY
>> * ZFS stripe of all devices = NOT OKAY
>>
>> The results are immediate, and I know ZFS will self-heal, so is that
>> what it is doing behind my back and just not reporting it? Is this a
>> ZFS bug with striping vs. raidz?
> Can you reproduce this problem using RELENG_8? Please try one of the
> below snapshots.
>
> ftp://ftp4.freebsd.org/pub/FreeBSD/snapshots/201011/
>
The server is in a data center with limited access control, do I have
the option of using a particular CVS tag (checking out via csup) and
then performing a make world/kernel?
If so, I can report back later today, otherwise it might take longer :( Mike C From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 19:29:54 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 52DE41065673 for ; Mon, 8 Nov 2010 19:29:53 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta02.westchester.pa.mail.comcast.net (qmta02.westchester.pa.mail.comcast.net [76.96.62.24]) by mx1.freebsd.org (Postfix) with ESMTP id F14618FC17 for ; Mon, 8 Nov 2010 19:29:52 +0000 (UTC) Received: from omta12.westchester.pa.mail.comcast.net ([76.96.62.44]) by qmta02.westchester.pa.mail.comcast.net with comcast id UcB91f0040xGWP852jVtav; Mon, 08 Nov 2010 19:29:53 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta12.westchester.pa.mail.comcast.net with comcast id UjVs1f0013LrwQ23YjVskQ; Mon, 08 Nov 2010 19:29:53 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id CD4379B427; Mon, 8 Nov 2010 11:29:50 -0800 (PST) Date: Mon, 8 Nov 2010 11:29:50 -0800 From: Jeremy Chadwick To: Mike Carlson Message-ID: <20101108192950.GA15902@icarus.home.lan> References: <4CD84258.6090404@llnl.gov> <20101108190640.GA15661@icarus.home.lan> <4CD84B63.4030800@llnl.gov> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CD84B63.4030800@llnl.gov> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: "freebsd-fs@freebsd.org" , "pjd@freebsd.org" Subject: Re: 8.1-RELEASE: ZFS data errors X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 19:29:55 -0000 On Mon, Nov 08, 2010 at 11:11:31AM -0800, Mike Carlson wrote: > On 11/08/2010 11:06 AM, Jeremy Chadwick wrote: > >On Mon, Nov 08, 2010 at 10:32:56AM -0800, Mike Carlson wrote: > >>I'm having a problem with 
stripping 7 18TB RAID6 (hardware SAN) > >>volumes together. > >> > >>Here is a quick rundown of the hardware: > >>* HP DL180 G6 w/12GB ram > >>* QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter) > >>* Winchester Hardware SAN, > >> > >> da2 at isp0 bus 0 scbus2 target 0 lun 0 > >> da2: Fixed Direct Access SCSI-5 device > >> da2: 800.000MB/s transfers > >> da2: Command Queueing enabled > >> da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C) > >> > >> > >>As soon as I create the volume and write data to it, it is reported > >>as being corrupted: > >> > >> write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8 > >> write# zpool scrub filevol001dd if=/dev/random > >> of=/filevol001/random.dat.1 bs=1m count=1000 > >> write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000 > >> 1000+0 records in > >> 1000+0 records out > >> 1048576000 bytes transferred in 16.472807 secs (63654968 bytes/sec) > >> write# cd /filevol001/ > >> write# ls > >> random.dat.1 > >> write# md5 * > >> MD5 (random.dat.1) = 629f8883d6394189a1658d24a5698bb3 > >> write# cp random.dat.1 random.dat.2 > >> cp: random.dat.1: Input/output error > >> write# zpool status > >> pool: filevol001 > >> state: ONLINE > >> scrub: none requested > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> filevol001 ONLINE 0 0 0 > >> da2 ONLINE 0 0 0 > >> da3 ONLINE 0 0 0 > >> da4 ONLINE 0 0 0 > >> da5 ONLINE 0 0 0 > >> da6 ONLINE 0 0 0 > >> da7 ONLINE 0 0 0 > >> da8 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> write# zpool scrub filevol001 > >> write# zpool status > >> pool: filevol001 > >> state: ONLINE > >> status: One or more devices has experienced an error resulting in data > >> corruption. Applications may be affected. > >> action: Restore the file in question if possible. Otherwise restore the > >> entire pool from backup. 
> >> see: http://www.sun.com/msg/ZFS-8000-8A > >> scrub: scrub completed after 0h0m with 2437 errors on Mon Nov 8 > >> 10:14:20 2010 > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> filevol001 ONLINE 0 0 2.38K > >> da2 ONLINE 0 0 1.24K 12K repaired > >> da3 ONLINE 0 0 1.12K > >> da4 ONLINE 0 0 1.13K > >> da5 ONLINE 0 0 1.27K > >> da6 ONLINE 0 0 0 > >> da7 ONLINE 0 0 0 > >> da8 ONLINE 0 0 0 > >> > >> errors: 2437 data errors, use '-v' for a list > >> > >>However, if I create a 'raidz' volume, no errors occur: > >> > >> write# zpool destroy filevol001 > >> write# zpool create filevol001 raidz da2 da3 da4 da5 da6 da7 da8 > >> write# zpool status > >> pool: filevol001 > >> state: ONLINE > >> scrub: none requested > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> filevol001 ONLINE 0 0 0 > >> raidz1 ONLINE 0 0 0 > >> da2 ONLINE 0 0 0 > >> da3 ONLINE 0 0 0 > >> da4 ONLINE 0 0 0 > >> da5 ONLINE 0 0 0 > >> da6 ONLINE 0 0 0 > >> da7 ONLINE 0 0 0 > >> da8 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000 > >> 1000+0 records in > >> 1000+0 records out > >> 1048576000 bytes transferred in 17.135045 secs (61194821 bytes/sec) > >> write# zpool scrub filevol001 > >> > >> dmesg output: > >> write# zpool status > >> pool: filevol001 > >> state: ONLINE > >> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > >> 09:54:51 2010 > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> filevol001 ONLINE 0 0 0 > >> raidz1 ONLINE 0 0 0 > >> da2 ONLINE 0 0 0 > >> da3 ONLINE 0 0 0 > >> da4 ONLINE 0 0 0 > >> da5 ONLINE 0 0 0 > >> da6 ONLINE 0 0 0 > >> da7 ONLINE 0 0 0 > >> da8 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> write# ls > >> random.dat.1 > >> write# cp random.dat.1 random.dat.2 > >> write# cp random.dat.1 random.dat.3 > >> write# cp random.dat.1 random.dat.4 > >> write# cp random.dat.1 random.dat.5 > >> write# cp random.dat.1 random.dat.6 > >> write# cp
random.dat.1 random.dat.7 > >> write# md5 * > >> MD5 (random.dat.1) = f5e3467f61a954bc2e0bcc35d49ac8b2 > >> MD5 (random.dat.2) = f5e3467f61a954bc2e0bcc35d49ac8b2 > >> MD5 (random.dat.3) = f5e3467f61a954bc2e0bcc35d49ac8b2 > >> MD5 (random.dat.4) = f5e3467f61a954bc2e0bcc35d49ac8b2 > >> MD5 (random.dat.5) = f5e3467f61a954bc2e0bcc35d49ac8b2 > >> MD5 (random.dat.6) = f5e3467f61a954bc2e0bcc35d49ac8b2 > >> MD5 (random.dat.7) = f5e3467f61a954bc2e0bcc35d49ac8b2 > >> > >>What is also odd, is if I create 7 separate ZFS volumes, they do not > >>report any data corruption: > >> > >> write# zpool destroy filevol001 > >> write# zpool create test01 da2 > >> write# zpool create test02 da3 > >> write# zpool create test03 da4 > >> write# zpool create test04 da5 > >> write# zpool create test05 da6 > >> write# zpool create test06 da7 > >> write# zpool create test07 da8 > >> write# zpool status > >> pool: test01 > >> state: ONLINE > >> scrub: none requested > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> test01 ONLINE 0 0 0 > >> da2 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> > >> pool: test02 > >> state: ONLINE > >> scrub: none requested > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> test02 ONLINE 0 0 0 > >> da3 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> > >> pool: test03 > >> state: ONLINE > >> scrub: none requested > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> test03 ONLINE 0 0 0 > >> da4 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> > >> pool: test04 > >> state: ONLINE > >> scrub: none requested > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> test04 ONLINE 0 0 0 > >> da5 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> > >> pool: test05 > >> state: ONLINE > >> scrub: none requested > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> test05 ONLINE 0 0 0 > >> da6 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> > >> pool: test06 > >> state: ONLINE > >> scrub: none requested > >> config: > >> > 
>> NAME STATE READ WRITE CKSUM > >> test06 ONLINE 0 0 0 > >> da7 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> > >> pool: test07 > >> state: ONLINE > >> scrub: none requested > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> test07 ONLINE 0 0 0 > >> da8 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> write# dd if=/dev/random of=/tmp/random.dat.1 bs=1m count=1000 > >> 1000+0 records in > >> 1000+0 records out > >> 1048576000 bytes transferred in 19.286735 secs (54367730 bytes/sec) > >> write# cd /tmp/ > >> write# md5 /tmp/random.dat.1 > >> MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > >> write# cp random.dat.1 /test01 ; cp random.dat.1 /test02 ;cp > >> random.dat.1 /test03 ; cp random.dat.1 /test04 ; cp random.dat.1 > >> /test05 ; cp random.dat.1 /test06 ; cp random.dat.1 /test07 > >> write# md5 /test*/* > >> MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > >> MD5 (/test02/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > >> MD5 (/test03/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > >> MD5 (/test04/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > >> MD5 (/test05/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > >> MD5 (/test06/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > >> MD5 (/test07/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > >> write# zpool scrub test01 ; zpool scrub test02 ;zpool scrub test03 > >> ;zpool scrub test04 ; zpool scrub test05 ; zpool scrub test06 ; > >> zpool scrub test07 > >> write# zpool status > >> pool: test01 > >> state: ONLINE > >> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > >> 10:27:49 2010 > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> test01 ONLINE 0 0 0 > >> da2 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> > >> pool: test02 > >> state: ONLINE > >> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > >> 10:27:52 2010 > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> test02 ONLINE 0 0 0 > >> da3 ONLINE 0 0 0 
> >> > >> errors: No known data errors > >> > >> pool: test03 > >> state: ONLINE > >> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > >> 10:27:54 2010 > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> test03 ONLINE 0 0 0 > >> da4 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> > >> pool: test04 > >> state: ONLINE > >> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > >> 10:27:57 2010 > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> test04 ONLINE 0 0 0 > >> da5 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> > >> pool: test05 > >> state: ONLINE > >> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > >> 10:28:00 2010 > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> test05 ONLINE 0 0 0 > >> da6 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> > >> pool: test06 > >> state: ONLINE > >> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > >> 10:28:02 2010 > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> test06 ONLINE 0 0 0 > >> da7 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> > >> pool: test07 > >> state: ONLINE > >> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 > >> 10:28:05 2010 > >> config: > >> > >> NAME STATE READ WRITE CKSUM > >> test07 ONLINE 0 0 0 > >> da8 ONLINE 0 0 0 > >> > >> errors: No known data errors > >> > >>Based on these results, I've drawn the following conclusion: > >>* ZFS single pool per device = OKAY > >>* ZFS raidz of all devices = OKAY > >>* ZFS stripe of all devices = NOT OKAY > >> > >>The results are immediate, and I know ZFS will self-heal, so is that > >>what it is doing behind my back and just not reporting it? Is this a > >>ZFS bug with striping vs. raidz? > >Can you reproduce this problem using RELENG_8? Please try one of the > >below snapshots. 
> >
> >ftp://ftp4.freebsd.org/pub/FreeBSD/snapshots/201011/
> >
> The server is in a data center with limited access control, do I
> have the option of using a particular CVS tag (checking out via csup)
> and then performing a make world/kernel?

Doing this is more painful than, say, downloading a livefs image and
seeing if you can reproduce the problem (e.g. you won't be modifying
your existing OS installation), especially since I can't guarantee that
the problem you're seeing is fixed in RELENG_8 (hence my request to
begin with).  But if you can't boot livefs, then here you go:

You'll need some form of console access (either serial or VGA) to do the
upgrade reliably.  "Rolling back" may also not be an option since
RELENG_8 is newer than RELENG_8_1 and may have introduced some new
binaries or executables into the fray.  If you don't have console access
to this machine, if things go awry you may be SOL.  The vagueness of my
statement is intentional; I can't cover every situation that might come
to light.

Please be sure to back up your kernel configuration file before doing
the following, and make sure that the supfile shown below has
tag=RELENG_8 in it (it should).  And yes, the rm commands below are
recommended; failure to use them could result in some oddities given
that your /usr/src tree refers to RELENG_8_1 version numbers which
differ from RELENG_8.  You *do not* have to do this for ports (since for
ports, tag=. is used by default).

  rm -fr /var/db/sup/src-all
  rm -fr /usr/src/*
  rm -fr /usr/obj/*
  csup -h cvsupserver -L 2 /usr/share/examples/cvsup/stable-supfile

At this point you can restore your kernel configuration file to the
appropriate place (/sys/i386/conf, /sys/amd64/conf, etc.) and build
world/kernel as per the instructions in /usr/src/Makefile (see lines
~51-62).  ***Please do not skip any of the steps***.

Good luck.
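
The check that the supfile really has tag=RELENG_8 in it can be scripted before the tree is wiped. What follows is only a minimal sketch: check_supfile_tag is a hypothetical helper name (csup provides nothing of the sort), and it assumes the supfile carries a "*default release=cvs tag=..." line the way the stock example files do.

```shell
#!/bin/sh
# Hypothetical helper (not part of csup): succeed only if the supfile
# selects exactly the given CVS tag on a non-comment line.
check_supfile_tag() {
    supfile=$1
    tag=$2
    # Drop comment lines, then look for "tag=<tag>" as a whole word so that
    # RELENG_8 does not also match RELENG_8_1.
    grep -v '^[[:space:]]*#' "$supfile" |
        grep -Eq "(^|[[:space:]])tag=${tag}([[:space:]]|\$)"
}

# Example use before running csup (path taken from the message above):
#   check_supfile_tag /usr/share/examples/cvsup/stable-supfile RELENG_8 \
#       || echo "supfile does not select RELENG_8" >&2
```

The tag is matched as a whole word on purpose, since a supfile left over from a release install could still say RELENG_8_1, and a plain substring search would accept it.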
-- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 19:32:04 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 54A6C1065672; Mon, 8 Nov 2010 19:32:04 +0000 (UTC) (envelope-from carlson39@llnl.gov) Received: from nspiron-1.llnl.gov (nspiron-1.llnl.gov [128.115.41.81]) by mx1.freebsd.org (Postfix) with ESMTP id 3D4CA8FC25; Mon, 8 Nov 2010 19:32:04 +0000 (UTC) X-Attachments: None Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135]) by nspiron-1.llnl.gov with ESMTP; 08 Nov 2010 11:32:03 -0800 Message-ID: <4CD85034.5000909@llnl.gov> Date: Mon, 08 Nov 2010 11:32:04 -0800 From: Mike Carlson User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: Jeremy Chadwick References: <4CD84258.6090404@llnl.gov> <20101108190640.GA15661@icarus.home.lan> <4CD84B63.4030800@llnl.gov> <20101108192950.GA15902@icarus.home.lan> In-Reply-To: <20101108192950.GA15902@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-fs@freebsd.org" , "pjd@freebsd.org" Subject: Re: 8.1-RELEASE: ZFS data errors X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2010 19:32:04 -0000 On 11/08/2010 11:29 AM, Jeremy Chadwick wrote: > On Mon, Nov 08, 2010 at 11:11:31AM -0800, Mike Carlson wrote: >> On 11/08/2010 11:06 AM, Jeremy Chadwick wrote: >>> On Mon, Nov 08, 2010 at 10:32:56AM -0800, Mike Carlson wrote: >>>> I'm having a problem with stripping 7 18TB RAID6 (hardware SAN) >>>> 
volumes together. >>>> >>>> Here is a quick rundown of the hardware: >>>> * HP DL180 G6 w/12GB ram >>>> * QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter) >>>> * Winchester Hardware SAN, >>>> >>>> da2 at isp0 bus 0 scbus2 target 0 lun 0 >>>> da2: Fixed Direct Access SCSI-5 device >>>> da2: 800.000MB/s transfers >>>> da2: Command Queueing enabled >>>> da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C) >>>> >>>> >>>> As soon as I create the volume and write data to it, it is reported >>>> as being corrupted: >>>> >>>> write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8 >>>> write# zpool scrub filevol001dd if=/dev/random >>>> of=/filevol001/random.dat.1 bs=1m count=1000 >>>> write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000 >>>> 1000+0 records in >>>> 1000+0 records out >>>> 1048576000 bytes transferred in 16.472807 secs (63654968 bytes/sec) >>>> write# cd /filevol001/ >>>> write# ls >>>> random.dat.1 >>>> write# md5 * >>>> MD5 (random.dat.1) = 629f8883d6394189a1658d24a5698bb3 >>>> write# cp random.dat.1 random.dat.2 >>>> cp: random.dat.1: Input/output error >>>> write# zpool status >>>> pool: filevol001 >>>> state: ONLINE >>>> scrub: none requested >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> filevol001 ONLINE 0 0 0 >>>> da2 ONLINE 0 0 0 >>>> da3 ONLINE 0 0 0 >>>> da4 ONLINE 0 0 0 >>>> da5 ONLINE 0 0 0 >>>> da6 ONLINE 0 0 0 >>>> da7 ONLINE 0 0 0 >>>> da8 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> write# zpool scrub filevol001 >>>> write# zpool status >>>> pool: filevol001 >>>> state: ONLINE >>>> status: One or more devices has experienced an error resulting in data >>>> corruption. Applications may be affected. >>>> action: Restore the file in question if possible. Otherwise restore the >>>> entire pool from backup. 
>>>> see: http://www.sun.com/msg/ZFS-8000-8A
>>>> scrub: scrub completed after 0h0m with 2437 errors on Mon Nov 8
>>>> 10:14:20 2010
>>>> config:
>>>>
>>>>     NAME          STATE     READ WRITE CKSUM
>>>>     filevol001    ONLINE       0     0 2.38K
>>>>       da2         ONLINE       0     0 1.24K  12K repaired
>>>>       da3         ONLINE       0     0 1.12K
>>>>       da4         ONLINE       0     0 1.13K
>>>>       da5         ONLINE       0     0 1.27K
>>>>       da6         ONLINE       0     0     0
>>>>       da7         ONLINE       0     0     0
>>>>       da8         ONLINE       0     0     0
>>>>
>>>> errors: 2437 data errors, use '-v' for a list
>>>>
>>>> However, if I create a 'raidz' volume, no errors occur:
>>>>
>>>> write# zpool destroy filevol001
>>>> write# zpool create filevol001 raidz da2 da3 da4 da5 da6 da7 da8
>>>> write# zpool status
>>>>   pool: filevol001
>>>>  state: ONLINE
>>>>  scrub: none requested
>>>> config:
>>>>
>>>>     NAME          STATE     READ WRITE CKSUM
>>>>     filevol001    ONLINE       0     0     0
>>>>       raidz1      ONLINE       0     0     0
>>>>         da2       ONLINE       0     0     0
>>>>         da3       ONLINE       0     0     0
>>>>         da4       ONLINE       0     0     0
>>>>         da5       ONLINE       0     0     0
>>>>         da6       ONLINE       0     0     0
>>>>         da7       ONLINE       0     0     0
>>>>         da8       ONLINE       0     0     0
>>>>
>>>> errors: No known data errors
>>>> write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
>>>> 1000+0 records in
>>>> 1000+0 records out
>>>> 1048576000 bytes transferred in 17.135045 secs (61194821 bytes/sec)
>>>> write# zpool scrub filevol001
>>>>
>>>> dmesg output:
>>>> write# zpool status
>>>>   pool: filevol001
>>>>  state: ONLINE
>>>>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
>>>> 09:54:51 2010
>>>> config:
>>>>
>>>>     NAME          STATE     READ WRITE CKSUM
>>>>     filevol001    ONLINE       0     0     0
>>>>       raidz1      ONLINE       0     0     0
>>>>         da2       ONLINE       0     0     0
>>>>         da3       ONLINE       0     0     0
>>>>         da4       ONLINE       0     0     0
>>>>         da5       ONLINE       0     0     0
>>>>         da6       ONLINE       0     0     0
>>>>         da7       ONLINE       0     0     0
>>>>         da8       ONLINE       0     0     0
>>>>
>>>> errors: No known data errors
>>>> write# ls
>>>> random.dat.1
>>>> write# cp random.dat.1 random.dat.2
>>>> write# cp random.dat.1 random.dat.3
>>>> write# cp random.dat.1 random.dat.4
>>>> write# cp random.dat.1 random.dat.5
>>>> write# cp random.dat.1 random.dat.6
>>>> 
write# cp random.dat.1 random.dat.7 >>>> write# md5 * >>>> MD5 (random.dat.1) = f5e3467f61a954bc2e0bcc35d49ac8b2 >>>> MD5 (random.dat.2) = f5e3467f61a954bc2e0bcc35d49ac8b2 >>>> MD5 (random.dat.3) = f5e3467f61a954bc2e0bcc35d49ac8b2 >>>> MD5 (random.dat.4) = f5e3467f61a954bc2e0bcc35d49ac8b2 >>>> MD5 (random.dat.5) = f5e3467f61a954bc2e0bcc35d49ac8b2 >>>> MD5 (random.dat.6) = f5e3467f61a954bc2e0bcc35d49ac8b2 >>>> MD5 (random.dat.7) = f5e3467f61a954bc2e0bcc35d49ac8b2 >>>> >>>> What is also odd, is if I create 7 separate ZFS volumes, they do not >>>> report any data corruption: >>>> >>>> write# zpool destroy filevol001 >>>> write# zpool create test01 da2 >>>> write# zpool create test02 da3 >>>> write# zpool create test03 da4 >>>> write# zpool create test04 da5 >>>> write# zpool create test05 da6 >>>> write# zpool create test06 da7 >>>> write# zpool create test07 da8 >>>> write# zpool status >>>> pool: test01 >>>> state: ONLINE >>>> scrub: none requested >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test01 ONLINE 0 0 0 >>>> da2 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> >>>> pool: test02 >>>> state: ONLINE >>>> scrub: none requested >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test02 ONLINE 0 0 0 >>>> da3 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> >>>> pool: test03 >>>> state: ONLINE >>>> scrub: none requested >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test03 ONLINE 0 0 0 >>>> da4 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> >>>> pool: test04 >>>> state: ONLINE >>>> scrub: none requested >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test04 ONLINE 0 0 0 >>>> da5 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> >>>> pool: test05 >>>> state: ONLINE >>>> scrub: none requested >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test05 ONLINE 0 0 0 >>>> da6 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> >>>> pool: test06 >>>> state: ONLINE >>>> scrub: none requested >>>> 
config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test06 ONLINE 0 0 0 >>>> da7 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> >>>> pool: test07 >>>> state: ONLINE >>>> scrub: none requested >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test07 ONLINE 0 0 0 >>>> da8 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> write# dd if=/dev/random of=/tmp/random.dat.1 bs=1m count=1000 >>>> 1000+0 records in >>>> 1000+0 records out >>>> 1048576000 bytes transferred in 19.286735 secs (54367730 bytes/sec) >>>> write# cd /tmp/ >>>> write# md5 /tmp/random.dat.1 >>>> MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >>>> write# cp random.dat.1 /test01 ; cp random.dat.1 /test02 ;cp >>>> random.dat.1 /test03 ; cp random.dat.1 /test04 ; cp random.dat.1 >>>> /test05 ; cp random.dat.1 /test06 ; cp random.dat.1 /test07 >>>> write# md5 /test*/* >>>> MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >>>> MD5 (/test02/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >>>> MD5 (/test03/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >>>> MD5 (/test04/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >>>> MD5 (/test05/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >>>> MD5 (/test06/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >>>> MD5 (/test07/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 >>>> write# zpool scrub test01 ; zpool scrub test02 ;zpool scrub test03 >>>> ;zpool scrub test04 ; zpool scrub test05 ; zpool scrub test06 ; >>>> zpool scrub test07 >>>> write# zpool status >>>> pool: test01 >>>> state: ONLINE >>>> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 >>>> 10:27:49 2010 >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test01 ONLINE 0 0 0 >>>> da2 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> >>>> pool: test02 >>>> state: ONLINE >>>> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 >>>> 10:27:52 2010 >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test02 ONLINE 0 0 0 >>>> 
da3 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> >>>> pool: test03 >>>> state: ONLINE >>>> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 >>>> 10:27:54 2010 >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test03 ONLINE 0 0 0 >>>> da4 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> >>>> pool: test04 >>>> state: ONLINE >>>> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 >>>> 10:27:57 2010 >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test04 ONLINE 0 0 0 >>>> da5 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> >>>> pool: test05 >>>> state: ONLINE >>>> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 >>>> 10:28:00 2010 >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test05 ONLINE 0 0 0 >>>> da6 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> >>>> pool: test06 >>>> state: ONLINE >>>> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 >>>> 10:28:02 2010 >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test06 ONLINE 0 0 0 >>>> da7 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> >>>> pool: test07 >>>> state: ONLINE >>>> scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 >>>> 10:28:05 2010 >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> test07 ONLINE 0 0 0 >>>> da8 ONLINE 0 0 0 >>>> >>>> errors: No known data errors >>>> >>>> Based on these results, I've drawn the following conclusion: >>>> * ZFS single pool per device = OKAY >>>> * ZFS raidz of all devices = OKAY >>>> * ZFS stripe of all devices = NOT OKAY >>>> >>>> The results are immediate, and I know ZFS will self-heal, so is that >>>> what it is doing behind my back and just not reporting it? Is this a >>>> ZFS bug with striping vs. raidz? >>> Can you reproduce this problem using RELENG_8? Please try one of the >>> below snapshots. 
>>>
>>> ftp://ftp4.freebsd.org/pub/FreeBSD/snapshots/201011/
>>>
>> The server is in a data center with limited access control, do I
>> have the option of using a particular CVS tag (checking out via csup)
>> and then perform a make world/kernel?
> Doing this is more painful than, say, downloading a livefs image and
> seeing if you can reproduce the problem (e.g. you won't be modifying
> your existing OS installation), especially since I can't guarantee that
> the problem you're seeing is fixed in RELENG_8 (hence my request to
> begin with). But if you can't boot livefs, then here you go:
>
> You'll need some form of console access (either serial or VGA) to do the
> upgrade reliably. "Rolling back" may also not be an option since
> RELENG_8 is newer than RELENG_8_1 and may have introduced some new
> binaries or executables into the fray. If you don't have console access
> to this machine, if things go awry you may be SOL. The vagueness of my
> statement is intentional; I can't cover every situation that might come
> to light.
>
> Please be sure to back up your kernel configuration file before doing
> the following, and make sure that the supfile shown below has
> tag=RELENG_8 in it (it should). And yes, the rm commands below are
> recommended; failure to use them could result in some oddities given
> that your /usr/src tree refers to RELENG_8_1 version numbers which
> differ from RELENG_8. You *do not* have to do this for ports (since for
> ports, tag=. is used by default).
>
> rm -fr /var/db/sup/src-all
> rm -fr /usr/src/*
> rm -fr /usr/obj/*
> csup -h cvsupserver -L 2 /usr/share/examples/cvsup/stable-supfile
>
> At this point you can restore your kernel configuration file to the
> appropriate place (/sys/i386/conf, /sys/amd64/conf, etc.) and build
> world/kernel as per the instructions in /usr/src/Makefile (see lines
> ~51-62). ***Please do not skip any of the steps***. Good luck.
>
> --
> | Jeremy Chadwick                                   jdc@parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                  Mountain View, CA, USA |
> | Making life hard for others since 1977.              PGP: 4BD6C0CB |
>
>
>
Ahh, point taken :)

I think I'll take a trip to the datacenter and boot off of a thumb drive...

Thanks Jeremy, I'll report back later!

Mike C

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 20:39:06 2010
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 317E91065695 for ; Mon, 8 Nov 2010 20:39:06 +0000 (UTC) (envelope-from brde@optusnet.com.au)
Received: from fallbackmx06.syd.optusnet.com.au (fallbackmx06.syd.optusnet.com.au [211.29.132.8]) by mx1.freebsd.org (Postfix) with ESMTP id BD1A58FC0A for ; Mon, 8 Nov 2010 20:39:05 +0000 (UTC)
Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au [211.29.132.186]) by fallbackmx06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id oA8KNSDh016903 for ; Tue, 9 Nov 2010 07:23:29 +1100
Received: from c122-107-121-73.carlnfd1.nsw.optusnet.com.au (c122-107-121-73.carlnfd1.nsw.optusnet.com.au [122.107.121.73]) by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id oA8KNPJ2032284 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 9 Nov 2010 07:23:26 +1100
Date: Tue, 9 Nov 2010 07:23:25 +1100 (EST)
From: Bruce Evans
X-X-Sender: bde@besplex.bde.org
To: freebsd-fs@freebsd.org, monthadar@gmail.com
In-Reply-To: <201011081748.oA8HmnLS085403@lurza.secnetix.de>
Message-ID: <20101109065842.R2343@besplex.bde.org>
References: <201011081748.oA8HmnLS085403@lurza.secnetix.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc:
Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done() ?error=22]
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 08 Nov 2010 20:39:06 -0000

On Mon, 8 Nov 2010, Oliver Fromme wrote:

> Monthadar Al Jaberi wrote:
> > I don't know if I am asking in the wrong place. But it has to do with
> > filesystem and onboard flash (16MB) on a RouterStation Pro board.
> >
> > I am running a FreeBSD Current 201010, with the kernel configuration
> > file specified in /usr/src/sys/mips/conf/AR71XX with device
> > geom_redboot.
> >
> > but I get this error when I try to mount from flash:
> > mount /dev/redboot/fs /var/fs
> > mount: /dev/redboot/fs Invalid sectorsize 65536 for superblock size
> > 8192: Invalid argument
> >
> > So I guessed it has to do with the flash configured in 64k sectors
> > according to the boot output.
> > ...
> > mx25l0: at cs 0 on spibus0
> > mx25l0: mx25ll128, sector 65536 bytes, 256 sectors
> > ...
>
> Historically UFS/FFS supports only 512 bytes per sector.
> I think it was patched at some point in the past to support
> 2048 bytes per sector, too, which is used by some MOD media
> and DVD-RAM. I'm pretty sure it does _not_ support 65536
> bytes per sector (someone please correct me if I'm wrong).

Maybe 25 years ago, but as of Net/2, 20 years ago, ffs doesn't really
even use sectors. It just has a buggy superblock probe which prevents it
from determining its correct i/o size when that size exceeds
SBLOCKSIZE = 8192.

>
> > So I just tried to change SBLOCKSIZE from 8192 to 65536 in
> > /usr/src/sys/ufs/ffs/fs.h, but then I got this error:
>
> That won't work. The media sector size is a hard limit;
> the driver will refuse to read or write anything that is
> not aligned to the media sector size. Changing the size
> of the super block (SBLOCKSIZE) won't help much.
> > > mount /dev/redboot/fs /mnt/fs > > g_vfs_done():redboot/fs[READ(offset=8192, length=65536)]error = 22 > > mount: /dev/redboot/fs : Invalid argument > > The UFS code tries to read the super block at offset 8192, > which is not aligned correctly (it's not a multiple of the > sector size). The ffs code (and also utility code like ffs_fsck) bogusly aborts the search on the first i/o error. Otherwise, changing just the size should work for ffs2. Changing both the offset and the size should work. I think it might just work for ffs2 once you recompile all utilities using the changed SBLOCKSEARCH and SBLOCKSIZE. The change will necessarily break ffs1: from ffs/fs.h: % * Depending on the architecture and the media, the superblock may % * reside in any one of four places. For tiny media where every block % * counts, it is placed at the very front of the partition. Historically, % * UFS1 placed it 8K from the front to leave room for the disk label and % * a small bootstrap. For UFS2 it got moved to 64K from the front to leave It can't be at 8K if the media sector size is 8K. Except, with more changes to the probe, perhaps it can be (e.g., always start by reading 128K at offset 0, and check what is at various offsets within the 128K. This covers all the usual cases in 1 i/o). % * room for the disk label and a bigger bootstrap, and for really piggy % * systems we check at 256K from the front if the first three fail. In % * all cases the size of the superblock will be SBLOCKSIZE. All values are Actually, the size of the superblock will never be SBLOCKSIZE. It will always be sizeof(struct fs), which is about 1500. SBLOCKSIZE is just the i/o size used in buggy probes for the superblock. % * given in byte-offset form, so they do not imply a sector size. The % * SBLOCKSEARCH specifies the order in which the locations should be searched. % */ % #define SBLOCK_FLOPPY 0 I don't remember this ever working. Floppies normally start with a 512-byte boot sector. 
Thus the superblock cannot start at 0 for a normal floppy, and I don't
remember anything ever supporting a sufficiently abnormal floppy for this
to work. (My kernel has incomplete support for this, putting the
superblock at offset 512, involving reducing MINBSIZE to 512 and using
512-byte blocks for everything. Everything works except superblock size
and probing issues. The superblock should end up beginning in the first
fs_bsize block after the boot blocks (normally 512, but could be 0 with
more work).)

% #define SBLOCK_UFS1 8192
% #define SBLOCK_UFS2 65536
% #define SBLOCK_PIGGY 262144
% #define SBLOCKSIZE 8192
% #define SBLOCKSEARCH \
% { SBLOCK_UFS2, SBLOCK_UFS1, SBLOCK_FLOPPY, SBLOCK_PIGGY, -1 }

Try changing SBLOCK_UFS1 to 65536 too. Obviously this breaks the normal
ffs1 case. Better, try changing only SBLOCKSIZE and not aborting on i/o
error.

> I think UFS is not the right file system to put on a flash
> media that has 256 sectors of 65536 bytes. In theory you
> could insert a translation layer that converts 512-byte
> access to 65536-byte access (requiring a read-modify-write
> operation when writing). Maybe gnop(8) can do this, it
> has a sector size option, but I haven't tried it. Anyway,
> that would be extremely inefficient.

Hmm, 256 sectors is small. It might have negative space for data after
using more than 256 sectors for metadata.
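
A minimal userland sketch of the single-read probe idea described above:
read 128K from offset 0 in one aligned I/O, then look for the magic number
at each candidate offset inside that buffer, moving on to the next
candidate instead of aborting. This is an illustration only, not the ffs
mount code. The helper names are hypothetical; the position of fs_magic
(1372 bytes into struct fs) and the two magic values are assumptions taken
from FreeBSD's fs.h; and the fs_sblockloc cross-check that the kernel
performs is left out. The 256K SBLOCK_PIGGY location lies outside the 128K
buffer and is not covered.

```shell
#!/bin/sh
# Sketch only: probe for a UFS superblock with ONE large aligned read.

FS_MAGIC_OFF=1372     # offset of fs_magic within struct fs (assumed)
PROBE_SIZE=131072     # 128K, a multiple of the 64K flash sector size

# magic_at FILE OFFSET: print the 4 bytes at OFFSET as 8 hex digits.
magic_at() {
    dd if="$1" bs=1 skip="$2" count=4 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# probe_superblock DEVICE: print "<type> <offset>" for the first hit.
# Returns 1 if no candidate matched, 2 if the single read itself failed.
probe_superblock() {
    buf=$(mktemp) || return 2
    # The only read that touches the device: 128K starting at offset 0.
    if ! dd if="$1" of="$buf" bs="$PROBE_SIZE" count=1 2>/dev/null; then
        rm -f "$buf"
        return 2
    fi
    # Same order as SBLOCKSEARCH, minus SBLOCK_PIGGY (outside the buffer).
    for off in 65536 8192 0; do
        m=$(magic_at "$buf" $((off + FS_MAGIC_OFF)))
        case "$m" in
            # FS_UFS2_MAGIC 0x19540119, little- or big-endian byte order
            19015419|19540119) echo "ufs2 $off"; rm -f "$buf"; return 0 ;;
            # FS_UFS1_MAGIC 0x00011954, little- or big-endian byte order
            54190100|00011954) echo "ufs1 $off"; rm -f "$buf"; return 0 ;;
        esac
    done
    rm -f "$buf"
    return 1
}
```

Something like "probe_superblock /dev/redboot/fs" would then report where
a superblock was seen without ever issuing an 8K read at offset 8192,
which is the request the 64K-sector driver rejects.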
Bruce From owner-freebsd-fs@FreeBSD.ORG Mon Nov 8 22:31:18 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 97B08106564A for ; Mon, 8 Nov 2010 22:31:18 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id 189F68FC26 for ; Mon, 8 Nov 2010 22:31:17 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1PFaF8-0006WG-A0 for freebsd-fs@freebsd.org; Mon, 08 Nov 2010 23:31:14 +0100 Received: from cpe-188-129-102-227.dynamic.amis.hr ([188.129.102.227]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 08 Nov 2010 23:31:14 +0100 Received: from ivoras by cpe-188-129-102-227.dynamic.amis.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 08 Nov 2010 23:31:14 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Date: Mon, 08 Nov 2010 23:30:59 +0100 Lines: 40 Message-ID: References: <4CD04AEC.8040607@aldan.algebra.com> <4CD051A9.7090200@freebsd.org> <4CD0660E.2000102@aldan.algebra.com> <4CD06C4B.80100@freebsd.org> <4CD0895A.5030402@aldan.algebra.com> <4CD09830.3030400@freebsd.org> <4CD48F81.1080201@aldan.algebra.com> <4CD5A5B7.4040006@aldan.algebra.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: cpe-188-129-102-227.dynamic.amis.hr User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6 In-Reply-To: <4CD5A5B7.4040006@aldan.algebra.com> Subject: Re: iozone-ing an SSD (Re: Using an SSD "disk" for /) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Mon, 08 Nov 2010 22:31:18 -0000

On 11/06/10 20:00, Mikhail T. wrote:
> On 11/5/2010 7:13 PM, Mikhail T. wrote:
>> The results can be found in 4 HTML files found at:
>> http://aldan.algebra.com/~mi/io/ (The original iozone-created Excel
>> files are there too.)

That server doesn't respond!

> I added some more iozone runs, as well as those of rawio. These are much
> fewer (as file-system parameters don't affect rawio) and easier to
> interpret:
>
> * It makes no difference to the SSD, whether your access is random
> or sequential

And this is their biggest strength. All others, like "raw" IO speed, are
in the majority of serious use cases secondary to that.

> * SSD clearly beats the HD in rawrite, although, at "only" 88Mb/sec,
> the results are far from the marketing...

Basically, you should be looking at IOPS, not MB/s.

> * SSD connected to plain SATA port strongly beats the same SSD
> connected to the fancy SAS controller (mpt)

Not really surprising. The controller might be too smart for its own good
in this simple case. But there could be other, more important reasons,
like the controller disabling the drive's (in this case, SSD's) write
caching, which the majority of "real" RAID controllers do by default.
Leaving the drive's write cache turned on puts your data at risk (which is
important if you are running servers).

You can sort of verify this hypothesis by setting this loader tunable:
hw.ata.wc=0 - this should disable the disk write cache for (S)ATA drives.
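
The loader tunable mentioned above could be set with something like the
following sketch. set_loader_tunable is a hypothetical helper, not an
existing FreeBSD command; it assumes the usual name="value" syntax of
/boot/loader.conf and that the disks attach through the ata(4) driver,
which is where hw.ata.wc applies. The setting only takes effect at the
next boot.

```shell
#!/bin/sh
# set_loader_tunable FILE NAME VALUE
# Drop any existing uncommented NAME=... line from FILE, then append
# NAME="VALUE", so running it twice does not create duplicate entries.
set_loader_tunable() {
    file=$1 name=$2 value=$3
    tmp=$(mktemp) || return 1
    if [ -f "$file" ]; then
        # Keep every line that does not start with "NAME=" (plain string
        # comparison, so the dots in the tunable name are not wildcards).
        awk -v n="$name" 'index($0, n "=") != 1' "$file" > "$tmp"
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Intended use (as root, then reboot and repeat the rawio run on the
# plain SATA port to see whether the gap to the mpt controller closes):
#   set_loader_tunable /boot/loader.conf hw.ata.wc 0
```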
From owner-freebsd-fs@FreeBSD.ORG Tue Nov 9 01:05:15 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5E1EA1065674; Tue, 9 Nov 2010 01:05:15 +0000 (UTC) (envelope-from carlson39@llnl.gov) Received: from smtp.llnl.gov (nspiron-3.llnl.gov [128.115.41.83]) by mx1.freebsd.org (Postfix) with ESMTP id 3E8D68FC37; Tue, 9 Nov 2010 01:05:15 +0000 (UTC) X-Attachments: None Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135]) by smtp.llnl.gov with ESMTP; 08 Nov 2010 17:05:14 -0800 Message-ID: <4CD89E4A.6000902@llnl.gov> Date: Mon, 08 Nov 2010 17:05:14 -0800 From: Mike Carlson User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: Jeremy Chadwick References: <4CD84258.6090404@llnl.gov> <20101108190640.GA15661@icarus.home.lan> <4CD84B63.4030800@llnl.gov> <20101108192950.GA15902@icarus.home.lan> In-Reply-To: <20101108192950.GA15902@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: "freebsd-fs@freebsd.org" , "pjd@freebsd.org" Subject: Re: 8.1-RELEASE: ZFS data errors X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2010 01:05:15 -0000 On 11/08/2010 11:29 AM, Jeremy Chadwick wrote: > On Mon, Nov 08, 2010 at 11:11:31AM -0800, Mike Carlson wrote: >> On 11/08/2010 11:06 AM, Jeremy Chadwick wrote: >>> On Mon, Nov 08, 2010 at 10:32:56AM -0800, Mike Carlson wrote: >>>> I'm having a problem with stripping 7 18TB RAID6 (hardware SAN) >>>> volumes together. 
>>>> >>>> Here is a quick rundown of the hardware: >>>> * HP DL180 G6 w/12GB ram >>>> * QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter) >>>> * Winchester Hardware SAN, >>>> >>>> da2 at isp0 bus 0 scbus2 target 0 lun 0 >>>> da2: Fixed Direct Access SCSI-5 device >>>> da2: 800.000MB/s transfers >>>> da2: Command Queueing enabled >>>> da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C) >>>> >>> >> The server is in a data center with limited access control, do I >> have to option of using a particular CVS tag (checking out via csup) >> and then perform a make world/kernel? > Doing this is more painful than, say, downloading a livefs image and > seeing if you can reproduce the problem (e.g. you won't be modifying > your existing OS installation), especially since I can't guarantee that > the problem you're seeing is fixed in RELENG_8 (hence my request to > begin with). But if you can't boot livefs, then here you go: > > You'll need some form of console access (either serial or VGA) to do the > upgrade reliably. "Rolling back" may also not be an option since > RELENG_8 is newer than RELENG_8_1 and may have introduced some new > binaries or executables into the fray. If you don't have console access > to this machine, if things go awry you may be SOL. The vagueness of my > statement is intentional; I can't cover every situation that might come > to light. > > Please be sure to back up your kernel configuration file before doing > the following, and make sure that the supfile shown below has > tag=RELENG_8 in it (it should). And yes, the rm commands below are > recommended; failure to use them could result in some oddities given > that your /usr/src tree refers to RELENG_8_1 version numbers which > differ from RELENG_8. You *do not* have to do this for ports (since for > ports, tag=. is used by default). 
>
> rm -fr /var/db/sup/src-all
> rm -fr /usr/src/*
> rm -fr /usr/obj/*
> csup -h cvsupserver -L 2 /usr/share/examples/cvsup/stable-supfile
>
> At this point you can restore your kernel configuration file to the
> appropriate place (/sys/i386/conf, /sys/amd64/conf, etc.) and build
> world/kernel as per the instructions in /usr/src/Makefile (see lines
> ~51-62). ***Please do not skip any of the steps***. Good luck.
>
> --
> | Jeremy Chadwick                                   jdc@parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                  Mountain View, CA, USA |
> | Making life hard for others since 1977.              PGP: 4BD6C0CB |
>
>
>
I wasn't able to make it to the Data Center to boot off of a USB/CD, but I
did follow your steps to upgrade to RELENG_8. So far, things are stable:

write# uname -a
FreeBSD write.llnl.gov 8.1-STABLE FreeBSD 8.1-STABLE #0: Mon Nov 8 16:38:06 PST 2010 root@write.llnl.gov:/usr/obj/usr/src/sys/GENERIC amd64
write# kldstat
Id Refs Address            Size     Name
 1   15 0xffffffff80100000 d86d18   kernel
 2    1 0xffffffff80e87000 f058     aio.ko
 3    1 0xffffffff80e97000 16ea40   ispfw.ko
 4    1 0xffffffff81006000 5568     geom_multipath.ko
 5    1 0xffffffff81222000 104ac5   zfs.ko
 6    1 0xffffffff81327000 1a15     opensolaris.ko
write# zpool create test01 da2 da3 da4 da5 da6 da7 da8
write# zpool status
write# cd /tmp
write# clear
write# cp random.dat.1 /test01/
write# cp random.dat.1 /test01/random.dat.2
write# cp random.dat.1 /test01/random.dat.3
write# cp random.dat.1 /test01/random.dat.4
write# cp random.dat.1 /test01/random.dat.5
write# cp random.dat.1 /test01/random.dat.6
write# md5 random.dat.1
MD5 (random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
write# md5 /test01/random.dat.*
MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test01/random.dat.2) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test01/random.dat.3) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test01/random.dat.4) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test01/random.dat.5) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test01/random.dat.6) = f795fa09e1b0975c0da0ec6e49544a36
write# zpool status
  pool: test01
 state: ONLINE
 scrub: none requested
config:

    NAME        STATE     READ WRITE CKSUM
    test01      ONLINE       0     0     0
      da2       ONLINE       0     0     0
      da3       ONLINE       0     0     0
      da4       ONLINE       0     0     0
      da5       ONLINE       0     0     0
      da6       ONLINE       0     0     0
      da7       ONLINE       0     0     0
      da8       ONLINE       0     0     0

errors: No known data errors
write# zpool scrub test01
write# zpool status
  pool: test01
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8 17:00:01 2010
config:

    NAME        STATE     READ WRITE CKSUM
    test01      ONLINE       0     0     0
      da2       ONLINE       0     0     0
      da3       ONLINE       0     0     0
      da4       ONLINE       0     0     0
      da5       ONLINE       0     0     0
      da6       ONLINE       0     0     0
      da7       ONLINE       0     0     0
      da8       ONLINE       0     0     0

errors: No known data errors

Any ideas for further testing to narrow down the culprit?

Oh, one other thing that I modified was /boot/loader.conf. I had
previously limited the vfs.zfs.arc_max to 1024M, so I had also commented
that out.

Thanks again, I'm going to continue writing files and scrubbing the array
until I have a level of confidence with the file system.
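
The copy, checksum and scrub cycle shown above could be repeated with a
small script along these lines. This is only a sketch: checksum and
verify_copies are hypothetical helper names, md5 -q is used where md5(1)
is installed (as on FreeBSD) with POSIX cksum as a fallback, and the pool
name and paths in the usage comment are the ones from the transcript.

```shell
#!/bin/sh
# checksum FILE: print a digest of FILE without the file name.
checksum() {
    if command -v md5 >/dev/null 2>&1; then
        md5 -q "$1"
    else
        cksum < "$1"
    fi
}

# verify_copies SRC DIR COUNT
# Copy SRC into DIR COUNT times and compare each copy's digest with the
# digest of SRC. Prints one line per bad copy; returns non-zero if any
# copy failed or differs.
verify_copies() {
    src=$1 dir=$2 count=$3
    want=$(checksum "$src") || return 2
    bad=0
    i=1
    while [ "$i" -le "$count" ]; do
        dst="$dir/$(basename "$src").$i"
        if ! cp "$src" "$dst" || [ "$(checksum "$dst")" != "$want" ]; then
            echo "MISMATCH: $dst"
            bad=$((bad + 1))
        fi
        i=$((i + 1))
    done
    [ "$bad" -eq 0 ]
}

# One round against the striped pool:
#   verify_copies /tmp/random.dat.1 /test01 6 && zpool scrub test01
# and, once the scrub has finished:
#   zpool status -x test01
```

Running a round with the vfs.zfs.arc_max line restored in
/boot/loader.conf and another with it commented out would separate the
effect of that change from the effect of the RELENG_8 upgrade.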
Mike C From owner-freebsd-fs@FreeBSD.ORG Tue Nov 9 11:23:20 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9B4C01065695 for ; Tue, 9 Nov 2010 11:23:20 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id 51C7C8FC14 for ; Tue, 9 Nov 2010 11:23:20 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1PFmII-0001HT-VD for freebsd-fs@freebsd.org; Tue, 09 Nov 2010 12:23:18 +0100 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 09 Nov 2010 12:23:18 +0100 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 09 Nov 2010 12:23:18 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Date: Tue, 09 Nov 2010 12:23:04 +0100 Lines: 13 Message-ID: References: <4CD84258.6090404@llnl.gov> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6 In-Reply-To: <4CD84258.6090404@llnl.gov> X-Enigmail-Version: 1.1.2 Subject: Re: 8.1-RELEASE: ZFS data errors X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2010 11:23:20 -0000 On 11/08/10 19:32, Mike Carlson wrote: > As soon as I create the volume and write data to it, it is reported as > being corrupted: > > write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8 > However, if I create a 'raidz' volume, no errors occur: A very interesting problem. 
Can you check with some other kind of volume manager that striping the data doesn't cause some unusual hardware interaction? Can you try, as an experiment, striping them all with gstripe (but you'll have to use a small stripe size like 16 KiB or 8 KiB)? From owner-freebsd-fs@FreeBSD.ORG Tue Nov 9 13:13:49 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 926D2106566C for ; Tue, 9 Nov 2010 13:13:49 +0000 (UTC) (envelope-from olli@lurza.secnetix.de) Received: from lurza.secnetix.de (lurza.secnetix.de [IPv6:2a01:170:102f::2]) by mx1.freebsd.org (Postfix) with ESMTP id ECC868FC18 for ; Tue, 9 Nov 2010 13:13:48 +0000 (UTC) Received: from lurza.secnetix.de (localhost [127.0.0.1]) by lurza.secnetix.de (8.14.3/8.14.3) with ESMTP id oA9DDVDT077097; Tue, 9 Nov 2010 14:13:47 +0100 (CET) (envelope-from oliver.fromme@secnetix.de) Received: (from olli@localhost) by lurza.secnetix.de (8.14.3/8.14.3/Submit) id oA9DDUoc077095; Tue, 9 Nov 2010 14:13:30 +0100 (CET) (envelope-from olli) Date: Tue, 9 Nov 2010 14:13:30 +0100 (CET) Message-Id: <201011091313.oA9DDUoc077095@lurza.secnetix.de> From: Oliver Fromme To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com, brde@optusnet.com.au In-Reply-To: <20101109065842.R2343@besplex.bde.org> X-Newsgroups: list.freebsd-fs User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX) (FreeBSD/6.4-PRERELEASE-20080904 (i386)) MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.3.5 (lurza.secnetix.de [127.0.0.1]); Tue, 09 Nov 2010 14:13:47 +0100 (CET) Cc: Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done() ?error=22] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com, brde@optusnet.com.au List-Id: Filesystems List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2010 13:13:49 -0000 Bruce Evans wrote: > On Mon, 8 Nov 2010, Oliver Fromme wrote: > > Monthadar Al Jaberi wrote: > > > [...] > > > mount: /dev/redboot/fs Invalid sectorsize 65536 for superblock size > > > 8192: Invalid argument > > > > > > So I guessed it has todo with the flash configured in 64k sectors > > > according to the boot output. > > > ... > > > mx25l0: at cs 0 on spibus0 > > > mx25l0: mx25ll128, sector 65536 bytes, 256 sectors > > > ... > > > > Historically UFS/FFS supports only 512 bytes per sector. > > I think it was patched at some point in the past to support > > 2048 bytes per sector, too, which is used by some MOD media > > and DVD-RAM. I'm pretty sure it does _not_ support 65536 > > bytes per sector (someone please correct me if I'm wrong). > > Maybe 25 years ago, but in Net2/ 20 years ago ffs doesn't really > even use sectors. It just has a buggy superblock probe which > prevents it determining its correct i/o size when that size > exceeds SBLOCKSIZE = 8192. In the second half of the 90s it did *not* support the 2048-byte sectors of the larger MOD media that became popular at that time. I owned several of those drives (still have one of them), so I remember it quite well. FreeBSD's file system code needed some patches in order to be able to use those disks. Before that, only 512- byte sectors worked. I don't know what sector sizes are supported today, but I wouldn't be surprised if only 512 to 2048 works out of the box. I'm not aware of any widely used media that has sectors smaller than 512 or larger than 2048. (Those new 4k drives translate accesses to/from 512 byte sectors, so it looks like a 512-byte sector drive.) Best regards Oliver -- Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M. Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung: secnetix Verwaltungsgesellsch. 
mbH, Handelsregister: Registergericht Mün- chen, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd Python is executable pseudocode. Perl is executable line noise. From owner-freebsd-fs@FreeBSD.ORG Tue Nov 9 13:29:16 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 16E84106566C for ; Tue, 9 Nov 2010 13:29:16 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id C2CBA8FC15 for ; Tue, 9 Nov 2010 13:29:15 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1PFoFm-0007jp-CW for freebsd-fs@freebsd.org; Tue, 09 Nov 2010 14:28:50 +0100 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 09 Nov 2010 14:28:50 +0100 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 09 Nov 2010 14:28:50 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Date: Tue, 09 Nov 2010 14:28:10 +0100 Lines: 18 Message-ID: References: <20101109065842.R2343@besplex.bde.org> <201011091313.oA9DDUoc077095@lurza.secnetix.de> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6 In-Reply-To: <201011091313.oA9DDUoc077095@lurza.secnetix.de> X-Enigmail-Version: 1.1.2 Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done() ?error=22] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2010 13:29:16 -0000 On 11/09/10 14:13, Oliver Fromme wrote: > I don't know what sector sizes are supported today, but > I wouldn't be surprised if only 512 to 2048 works out > of the box. I'm not aware of any widely used media > that has sectors smaller than 512 or larger than 2048. > (Those new 4k drives translate accesses to/from 512 byte > sectors, so it looks like a 512-byte sector drive.) Can't say much about the Olden Times, but now it's trivial to show that UFS, in fact, works ok with various sector sizes using gnop. I've tested it with at least 4 KiB sectors recently, and I think I remember successfully trying 8 KiB sectors a few years ago in a project. What might or might not work well, I think, is having a fragment/block ratio other than "8". From owner-freebsd-fs@FreeBSD.ORG Tue Nov 9 13:59:35 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9D111106564A; Tue, 9 Nov 2010 13:59:35 +0000 (UTC) (envelope-from olli@lurza.secnetix.de) Received: from lurza.secnetix.de (lurza.secnetix.de [IPv6:2a01:170:102f::2]) by mx1.freebsd.org (Postfix) with ESMTP id 1FDB58FC17; Tue, 9 Nov 2010 13:59:34 +0000 (UTC) Received: from lurza.secnetix.de (localhost [127.0.0.1]) by lurza.secnetix.de (8.14.3/8.14.3) with ESMTP id oA9DxIuw079253; Tue, 9 Nov 2010 14:59:33 +0100 (CET) (envelope-from oliver.fromme@secnetix.de) Received: (from olli@localhost) by lurza.secnetix.de (8.14.3/8.14.3/Submit) id oA9DxI3U079252; Tue, 9 Nov 2010 14:59:18 +0100 (CET) (envelope-from olli) Date: Tue, 9 Nov 2010 14:59:18 +0100 (CET) Message-Id: <201011091359.oA9DxI3U079252@lurza.secnetix.de> From: Oliver Fromme To: freebsd-fs@FreeBSD.ORG, ivoras@FreeBSD.ORG In-Reply-To: X-Newsgroups: list.freebsd-fs User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX) (FreeBSD/6.4-PRERELEASE-20080904
(i386)) MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.3.5 (lurza.secnetix.de [127.0.0.1]); Tue, 09 Nov 2010 14:59:34 +0100 (CET) Cc: Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done() ?error=22] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: freebsd-fs@FreeBSD.ORG, ivoras@FreeBSD.ORG List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2010 13:59:35 -0000 Ivan Voras wrote: > What might or might not work well, I think, is having fragment/block > ratio other then "8". I've also successfully used fsize == bsize (i.e. ratio 1). But I also think that ratios other than 1 and 8 will not work well. Another question is whether other FS-related tools like fsck(8) and dump(8) work well with unusual file systems. Having FFS support for, say, 65536 byte sectors won't buy you much if fsck can't reliably handle it. Best regards Oliver -- Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M. Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung: secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün- chen, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd "If Java had true garbage collection, most programs would delete themselves upon execution." 
-- Robert Sewell From owner-freebsd-fs@FreeBSD.ORG Tue Nov 9 14:16:39 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BD3041065673 for ; Tue, 9 Nov 2010 14:16:39 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au [211.29.132.187]) by mx1.freebsd.org (Postfix) with ESMTP id 56AFE8FC1F for ; Tue, 9 Nov 2010 14:16:39 +0000 (UTC) Received: from c122-107-121-73.carlnfd1.nsw.optusnet.com.au (c122-107-121-73.carlnfd1.nsw.optusnet.com.au [122.107.121.73]) by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id oA9EGaJC026668 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 10 Nov 2010 01:16:37 +1100 Date: Wed, 10 Nov 2010 01:16:36 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com, brde@optusnet.com.au In-Reply-To: <201011091313.oA9DDUoc077095@lurza.secnetix.de> Message-ID: <20101110004642.H1101@besplex.bde.org> References: <201011091313.oA9DDUoc077095@lurza.secnetix.de> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done() ?error=22] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2010 14:16:39 -0000 On Tue, 9 Nov 2010, Oliver Fromme wrote: > Bruce Evans wrote: > > On Mon, 8 Nov 2010, Oliver Fromme wrote: > > > Monthadar Al Jaberi wrote: > > > > [...] > > > > mount: /dev/redboot/fs Invalid sectorsize 65536 for superblock size > > > > 8192: Invalid argument > > > > > > > > So I guessed it has todo with the flash configured in 64k sectors > > > > according to the boot output. > > > > ... 
> > > > mx25l0: at cs 0 on spibus0 > > > > mx25l0: mx25ll128, sector 65536 bytes, 256 sectors > > > > ... > > > > > > Historically UFS/FFS supports only 512 bytes per sector. > > > I think it was patched at some point in the past to support > > > 2048 bytes per sector, too, which is used by some MOD media > > > and DVD-RAM. I'm pretty sure it does _not_ support 65536 > > > bytes per sector (someone please correct me if I'm wrong). > > > > Maybe 25 years ago, but in Net2/ 20 years ago ffs doesn't really > > even use sectors. It just has a buggy superblock probe which > > prevents it determining its correct i/o size when that size > > exceeds SBLOCKSIZE = 8192. > > In the second half of the 90s it did *not* support the > 2048-byte sectors of the larger MOD media that became > popular at that time. I owned several of those drives > (still have one of them), so I remember it quite well. > FreeBSD's file system code needed some patches in order > to be able to use those disks. Before that, only 512- > byte sectors worked. That's strange, since in the initial slice code in 1994 or 1995, I emulated 4K-sectors in uncommitted patches in the floppy driver and thought I tested ffs with them. > I don't know what sector sizes are supported today, but > I wouldn't be surprised if only 512 to 2048 works out > of the box. I'm not aware of any widely used media > that has sectors smaller than 512 or larger than 2048. > (Those new 4k drives translate accesses to/from 512 byte > sectors, so it looks like a 512-byte sector drive.) I have a DVD drive that only supports writing 32K-blocks on DVD-R. In 2005 I gave up trying to get ffs to work on this. The drive supports reading the usual 2K-blocks, so the ffs probe worked, and the ffs block size just needed to be set to 32K or 64K so that writes worked too. But this block size wastes a lot of space and time for small files. 
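The space cost mentioned above can be put into rough numbers. The following is only a sketch under stated assumptions: it uses the usual block/fragment ratio of 8, it rounds each small file up to a whole number of fragments, and it ignores inodes and indirect blocks. The function names are made up for illustration and are not taken from any FFS source.

```python
# Simplified model (an assumption, not the real FFS allocator):
# a small file occupies a whole number of fragments, where
# fragment size = block size / 8.


def allocated_bytes(file_size: int, frag_size: int) -> int:
    """Round file_size up to the next multiple of frag_size."""
    if file_size == 0:
        return 0
    return -(-file_size // frag_size) * frag_size  # ceiling division


def wasted_bytes(file_size: int, block_size: int, frags_per_block: int = 8) -> int:
    """Bytes allocated but not used by a file of file_size bytes."""
    frag_size = block_size // frags_per_block
    return allocated_bytes(file_size, frag_size) - file_size


if __name__ == "__main__":
    # Compare a common 16K block size with the 32K/64K sizes discussed above.
    for block_size in (16 * 1024, 32 * 1024, 64 * 1024):
        waste = wasted_bytes(1000, block_size)
        print(f"bsize={block_size} fsize={block_size // 8} "
              f"waste per 1000-byte file={waste}")
```

Under this model the waste per small file grows in step with the fragment size, which is why a few hundred thousand small files get expensive at a 32K or 64K block size.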
FreeBSD's buffering is bad for read-mostly media, and I never found any file system that works well for small files on DVDs or CDROMs (images that can be written in 10 minutes take more like 10 hours to read back if they contain a few hundred thousand small files). mdconfig allows any representable sector size except 0 and possibly non- power-of-2 ones (mdconfig(8) uses strtoul() with null error handling; md(4) checks for a power of 2 in the malloc-backed case but has null arg checking in other cases (if other cases are reached now -- this feature used to be limited to the malloc-backed case)). Power of 2 sizes that cannot work because they exceed MAXBSIZE can certainly be configured. Bruce From owner-freebsd-fs@FreeBSD.ORG Tue Nov 9 15:03:50 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DA62D1065673 for ; Tue, 9 Nov 2010 15:03:50 +0000 (UTC) (envelope-from monthadar@gmail.com) Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id A64AA8FC0C for ; Tue, 9 Nov 2010 15:03:50 +0000 (UTC) Received: by pwi10 with SMTP id 10so217601pwi.13 for ; Tue, 09 Nov 2010 07:03:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=GWSu1ZlD4YFGZ8BW0It1eIYC8Jrc3KpgCgW9rLajbrc=; b=ZUdpHsb13HYUncHUFil2S4Ljqzdc744c/XyuGol2blQ8pX87S4Oh+wCGY8c2XqAG2K j1z1QPVxG7DvnwSpjjrIpAAZ/iChEPxKRn1Bp0HJJJKF0TXkPRjQmM8ltvhUX7LZdEL+ BihRJYa0VIUPYT6HlCrrII9zt+yZ0O4R2ElIk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=whvzaoLbgVQtaltfGEpaRPEzj1ItndNzaZCO0eVFrkT1IowlSG51W+bkwGOexW4YDc 
7o3KmlvYuFBcgxRqdWbTNTltyVfbE/lxN8xAvXjy/9UYouweQKJIgy4a/9urrXSrG5t3 iH5EldU0z6K9dLYnZ0w0iY/68CWX64atXB3+E= MIME-Version: 1.0 Received: by 10.229.225.199 with SMTP id it7mr6556763qcb.33.1289315029694; Tue, 09 Nov 2010 07:03:49 -0800 (PST) Received: by 10.229.182.77 with HTTP; Tue, 9 Nov 2010 07:03:49 -0800 (PST) In-Reply-To: <20101110004642.H1101@besplex.bde.org> References: <201011091313.oA9DDUoc077095@lurza.secnetix.de> <20101110004642.H1101@besplex.bde.org> Date: Tue, 9 Nov 2010 16:03:49 +0100 Message-ID: From: Monthadar Al Jaberi To: Bruce Evans , olli@lurza.secnetix.de, ivoras@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done() ?error=22] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2010 15:03:50 -0000 Thank you for the clarifications! Definitely the right place to ask! I successfully tried the mdconfig version, but gnop gives an error message: gnop: Invalid secsize for provider redboot/fs. for any sectorsize that I give other than 64K, or multiples of 64K :/ RSPRO3# gnop create -s 128k -S 256k /dev/redboot/fs GEOM_NOP: Device redboot/fs.nop created. but that won't help mount... I looked into the datasheet for the flash (MX25125805D) and it seems like it can work in both 64K and 4K? Confused. (http://www.macronix.com/QuickPlace/hq/PageLibrary482576EF002A2699.nsf/h_Index/30D2368B704F50B9482576EF002D070F/?OpenDocument&Type=Serial%20Flash&Density=128Mb) Also I think the mx25 driver in FreeBSD is not configured correctly for my flash. Adrian Chadd had some diffs for his flash, but they seem to be for another, slightly different flash. (http://people.freebsd.org/~adrian/rspro/) For example there is no CMD_BLOCK_ERASE_32K in the datasheet for my flash.
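The gnop behaviour reported above (only 64K, or multiples of 64K, are accepted on this provider) is what one would expect if the requested sector size has to be a whole multiple of the underlying provider's sector size. The sketch below only illustrates that rule. The rule itself is an assumption inferred from the error message and the observed results, and the function name is hypothetical, not part of gnop.

```python
# Sector size reported for the flash in the boot output quoted in this thread.
PROVIDER_SECTOR_SIZE = 65536


def secsize_accepted(requested: int, provider: int = PROVIDER_SECTOR_SIZE) -> bool:
    """Return True if `requested` is a positive multiple of the provider's
    sector size (assumed acceptance rule, inferred from the observed errors)."""
    return requested > 0 and requested % provider == 0


if __name__ == "__main__":
    for size in (512, 4096, 65536, 131072, 262144):
        verdict = "accepted" if secsize_accepted(size) else "Invalid secsize"
        print(f"{size}: {verdict}")
```

If that rule holds, gnop can only present sectors at least as large as the underlying 64K ones, so it cannot be used to emulate the smaller sectors that the UFS superblock probe needs.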
Would it help me if I changed the flash driver to work with 4K? Or do I still need to either, mdconfig, gnop or play UFS/UFS2 code (hard for me)? Basically I have a cross compiled kernel+mdroot with tinyBSD wireless configuration, zipped and stored on the flash. So I am trying to have a filesystem on the flash that will shadow changes. When I zip it takes ~10M instead of 47M! br, On Tue, Nov 9, 2010 at 3:16 PM, Bruce Evans wrote: > On Tue, 9 Nov 2010, Oliver Fromme wrote: > >> Bruce Evans wrote: >> > On Mon, 8 Nov 2010, Oliver Fromme wrote: >> > > Monthadar Al Jaberi wrote: >> > > > [...] >> > > > mount: /dev/redboot/fs Invalid sectorsize 65536 for superblock size >> > > > 8192: Invalid argument >> > > > >> > > > So I guessed it has todo with the flash configured in 64k sectors >> > > > according to the boot output. >> > > > ... >> > > > mx25l0: at cs 0 on spibus0 >> > > > mx25l0: mx25ll128, sector 65536 bytes, 256 sectors >> > > > ... >> > > >> > > Historically UFS/FFS supports only 512 bytes per sector. >> > > I think it was patched at some point in the past to support >> > > 2048 bytes per sector, too, which is used by some MOD media >> > > and DVD-RAM. I'm pretty sure it does _not_ support 65536 >> > > bytes per sector (someone please correct me if I'm wrong). >> > >> > Maybe 25 years ago, but in Net2/ 20 years ago ffs doesn't really >> > even use sectors. It just has a buggy superblock probe which >> > prevents it determining its correct i/o size when that size >> > exceeds SBLOCKSIZE = 8192. >> >> In the second half of the 90s it did *not* support the >> 2048-byte sectors of the larger MOD media that became >> popular at that time. I owned several of those drives >> (still have one of them), so I remember it quite well. >> FreeBSD's file system code needed some patches in order >> to be able to use those disks. Before that, only 512- >> byte sectors worked.
> > That's strange, since in the initial slice code in 1994 or 1995, I > emulated 4K-sectors in uncommitted patches in the floppy driver and > thought I tested ffs with them. > >> I don't know what sector sizes are supported today, but >> I wouldn't be surprised if only 512 to 2048 works out >> of the box. I'm not aware of any widely used media >> that has sectors smaller than 512 or larger than 2048. >> (Those new 4k drives translate accesses to/from 512 byte >> sectors, so it looks like a 512-byte sector drive.) > > I have a DVD drive that only supports writing 32K-blocks on DVD-R. In > 2005 I gave up trying to get ffs to work on this. The drive supports > reading the usual 2K-blocks, so the ffs probe worked, and the ffs block > size just needed to be set to 32K or 64K so that writes worked too. > But this block size wastes a lot of space and time for small files. > FreeBSD's buffering is bad for read-mostly media, and I never found > any file system that works well for small files on DVDs or CDROMs > (images that can be written in 10 minutes take more like 10 hours > to read back if they contain a few hundred thousand small files). > > mdconfig allows any representable sector size except 0 and possibly non- > power-of-2 ones (mdconfig(8) uses strtoul() with null error handling; > md(4) checks for a power of 2 in the malloc-backed case but has null > arg checking in other cases (if other cases are reached now -- this > feature used to be limited to the malloc-backed case)). Power of 2 > sizes that cannot work because they exceed MAXBSIZE can certainly be > configured.
> > Bruce > -- //Monthadar Al Jaberi From owner-freebsd-fs@FreeBSD.ORG Tue Nov 9 15:11:46 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 40B591065693; Tue, 9 Nov 2010 15:11:46 +0000 (UTC) (envelope-from bruce@cran.org.uk) Received: from muon.cran.org.uk (muon.cran.org.uk [IPv6:2a01:348:0:15:5d59:5c40:0:1]) by mx1.freebsd.org (Postfix) with ESMTP id 6B83E8FC20; Tue, 9 Nov 2010 15:11:45 +0000 (UTC) Received: from muon.cran.org.uk (localhost [127.0.0.1]) by muon.cran.org.uk (Postfix) with ESMTP id BB53BE7208; Tue, 9 Nov 2010 15:11:44 +0000 (GMT) Received: from core.nessbank (client-82-26-212-122.pete.adsl.virginmedia.com [82.26.212.122]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by muon.cran.org.uk (Postfix) with ESMTPSA; Tue, 9 Nov 2010 15:11:43 +0000 (GMT) From: Bruce Cran To: freebsd-fs@freebsd.org Date: Tue, 9 Nov 2010 15:11:42 +0000 User-Agent: KMail/1.13.5 (FreeBSD/9.0-CURRENT; KDE/4.5.2; amd64; ; ) MIME-Version: 1.0 Content-Type: Multipart/Mixed; boundary="Boundary-00=_uSW2MSQc6oN+6RL" Message-Id: <201011091511.42902.bruce@cran.org.uk> Cc: Subject: Fwd: Re: Corruption of UFS filesystems after using md(4) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2010 15:11:46 -0000 --Boundary-00=_uSW2MSQc6oN+6RL Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit I'm forwarding this because I guess someone from fs@ might be interested. It seems creating sparse files now causes UFS filesystems to become corrupt on CURRENT.
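For readers unfamiliar with sparse files, here is a minimal sketch of what "truncate -s20G f1" does at the system-call level, scaled down to 64 MiB so it is safe to try anywhere. It is an illustration only and does not reproduce the reported bug. The helper name is made up for this example, and `st_blocks` is only available on Unix-like systems.

```python
import os
import tempfile


def make_sparse_file(path: str, size: int) -> os.stat_result:
    """Create a file of `size` bytes without writing any data, then stat it."""
    with open(path, "wb") as f:
        f.truncate(size)  # extends the file; no data blocks are written
    return os.stat(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        st = make_sparse_file(os.path.join(d, "f1"), 64 * 1024 * 1024)
        # The apparent size is what ls -l reports; the allocated size is what
        # du reports. How far they differ depends on the file system.
        print("apparent size:", st.st_size)
        print("allocated bytes:", st.st_blocks * 512)
```

On a file system that supports holes, the allocated size stays far below the apparent size, which is the situation the forwarded report is about.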
-- Bruce Cran --Boundary-00=_uSW2MSQc6oN+6RL Content-Type: message/rfc822; name="forwarded message" Content-Transfer-Encoding: 7bit Content-Description: Peter Holm : Re: Corruption of UFS filesystems after using md(4) Content-Disposition: inline Return-Path: X-Original-To: bruce@cran.org.uk Delivered-To: brucec@muon.cran.org.uk Received: from muon.cran.org.uk (localhost [127.0.0.1]) by muon.cran.org.uk (Postfix) with ESMTP id AFD0DE7207 for ; Wed, 3 Nov 2010 07:16:07 +0000 (GMT) X-Spam-Flag: NO X-Spam-Score: -1.911 X-Spam-Level: X-Spam-Status: No, score=-1.911 tagged_above=-999 required=10 tests=[BAYES_00=-1.9, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01] autolearn=ham Received: from mx2.freebsd.org (mx2.freebsd.org [IPv6:2001:4f8:fff6::35]) by muon.cran.org.uk (Postfix) with ESMTP for ; Wed, 3 Nov 2010 07:16:06 +0000 (GMT) Received: from hub.freebsd.org (hub.freebsd.org [IPv6:2001:4f8:fff6::36]) by mx2.freebsd.org (Postfix) with ESMTP id 107A6178D32; Wed, 3 Nov 2010 07:15:54 +0000 (UTC) Received: from hub.freebsd.org (localhost [127.0.0.1]) by hub.freebsd.org (Postfix) with ESMTP id 2876C1065780; Wed, 3 Nov 2010 07:15:52 +0000 (UTC) (envelope-from owner-freebsd-current@freebsd.org) Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6AC061065670 for ; Wed, 3 Nov 2010 07:15:46 +0000 (UTC) (envelope-from pho@holm.cc) Received: from relay00.pair.com (relay00.pair.com [209.68.5.9]) by mx1.freebsd.org (Postfix) with SMTP id 1B4648FC17 for ; Wed, 3 Nov 2010 07:15:46 +0000 (UTC) Received: (qmail 82243 invoked from network); 3 Nov 2010 06:49:05 -0000 Received: from 93.166.52.54 (HELO x2.osted.lan) (93.166.52.54) by relay00.pair.com with SMTP; 3 Nov 2010 06:49:05 -0000 X-pair-Authenticated: 93.166.52.54 Received: from x2.osted.lan (localhost [127.0.0.1]) by x2.osted.lan (8.14.3/8.14.3) with ESMTP id oA36n5Xg039844; Wed, 3 Nov 2010 07:49:05 +0100 (CET) (envelope-from 
pho@x2.osted.lan) Received: (from pho@localhost) by x2.osted.lan (8.14.3/8.14.3/Submit) id oA36n5SE039843; Wed, 3 Nov 2010 07:49:05 +0100 (CET) (envelope-from pho) Date: Wed, 3 Nov 2010 07:49:04 +0100 From: Peter Holm To: Bruce Cran Message-ID: <20101103064904.GA39407@x2.osted.lan> References: <201011021912.14281.bruce@cran.org.uk> <201011021933.51052.bruce@cran.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201011021933.51052.bruce@cran.org.uk> User-Agent: Mutt/1.4.2.3i Cc: freebsd-current@freebsd.org Subject: Re: Corruption of UFS filesystems after using md(4) X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: owner-freebsd-current@freebsd.org Errors-To: owner-freebsd-current@freebsd.org X-UID: 38219 X-Length: 5168 On Tue, Nov 02, 2010 at 07:33:50PM +0000, Bruce Cran wrote: > On Tuesday 02 November 2010 19:12:14 Bruce Cran wrote: > > I've noticed in recent months that I appear to be getting silent corruption > > of my UFS filesystems - and I think it may be linked to using md(4) or > > creating sparse files. > > I've confirmed this is a UFS bug related to sparse files: "truncate -s20G f1 > && rm f1" is enough to trigger the error and start generating .viminfo files > that appear to be 20GB. When running fsck I get an "Invalid block count" error > if I just reboot without removing the .viminfo file; if I do remove it, I get > a "Partially allocated inode" error. 
> I'm able to verify this by: "m.sh" 49L, 1917C written $ ./m.sh Local config: x4 + mdconfig -a -t swap -s 1g -u 5 + bsdlabel -w md5 auto + newfs -U md5a + mount /dev/md5a /mnt + truncate -s20G /mnt/f1 + rm /mnt/f1 + umount /mnt + fsck -t ufs -y /dev/md5a ** /dev/md5a ** Last Mounted on /mnt ** Phase 1 - Check Blocks and Sizes PARTIALLY ALLOCATED INODE I=4 UNEXPECTED SOFT UPDATE INCONSISTENCY CLEAR? yes ** Phase 2 - Check Pathnames ** Phase 3 - Check Connectivity ** Phase 4 - Check Reference Counts ** Phase 5 - Check Cyl groups FREE BLK COUNT(S) WRONG IN SUPERBLK SALVAGE? yes SUMMARY INFORMATION BAD SALVAGE? yes BLK(S) MISSING IN BIT MAPS SALVAGE? yes 2 files, 2 used, 506481 free (25 frags, 63307 blocks, 0.0% fragmentation) ***** FILE SYSTEM IS CLEAN ***** ***** FILE SYSTEM WAS MODIFIED ***** + mdconfig -d -u 5 $ - Peter _______________________________________________ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" --Boundary-00=_uSW2MSQc6oN+6RL-- From owner-freebsd-fs@FreeBSD.ORG Tue Nov 9 16:55:31 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 201EE1065672 for ; Tue, 9 Nov 2010 16:55:31 +0000 (UTC) (envelope-from v.velox@vvelox.net) Received: from vulpes.vvelox.net (sula-ki.vvelox.net [99.69.115.46]) by mx1.freebsd.org (Postfix) with ESMTP id D9FD28FC1D for ; Tue, 9 Nov 2010 16:55:30 +0000 (UTC) Received: from vixen42.vulpes.vvelox.net (unknown [192.168.14.2]) (Authenticated sender: v.velox) by vulpes.vvelox.net (Postfix) with ESMTPA id 230F2B885 for ; Tue, 9 Nov 2010 10:22:18 -0600 (CST) Date: Tue, 9 Nov 2010 10:17:40 -0600 From: "Zane C.B." 
To: freebsd-fs@freebsd.org Message-ID: <20101109101740.2e8e80ce@vixen42.vulpes.vvelox.net> X-Mailer: Claws Mail 3.7.6 (GTK+ 2.20.1; amd64-portbld-freebsd8.1) Mime-Version: 1.0 Content-Type: multipart/signed; micalg=PGP-SHA1; boundary="Sig_/dMzSZwx/X+f8WKtpRC_qITB"; protocol="application/pgp-signature" Subject: NFS locking X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2010 16:55:31 -0000 --Sig_/dMzSZwx/X+f8WKtpRC_qITB Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable What does it take to get NFS locking working? Any time I start lockd on the server, I get the message below in dmesg. NLM: failed to contact remote rpcbind, stat = 0, port = 0 NLM: failed to contact remote rpcbind, stat = 0, port = 0 Can't start NLM - unable to contact NSM Any ideas? --Sig_/dMzSZwx/X+f8WKtpRC_qITB Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (FreeBSD) iEYEARECAAYFAkzZdC4ACgkQqrJJy0yxYQDGXgCdHb52sCUd6LH79+PnaByWhtdH VtkAn0QzUR41d7rp6Fgg1/uKlnXFQoxn =etQ9 -----END PGP SIGNATURE----- --Sig_/dMzSZwx/X+f8WKtpRC_qITB-- From owner-freebsd-fs@FreeBSD.ORG Tue Nov 9 17:01:06 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DC7D31065675 for ; Tue, 9 Nov 2010 17:01:06 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta07.emeryville.ca.mail.comcast.net (qmta07.emeryville.ca.mail.comcast.net [76.96.30.64]) by mx1.freebsd.org (Postfix) with ESMTP id C1C558FC1D for ; Tue, 9 Nov 2010 17:01:06 +0000 (UTC) Received: from omta07.emeryville.ca.mail.comcast.net ([76.96.30.59]) by qmta07.emeryville.ca.mail.comcast.net with comcast id
V3rl1f0041GXsucA7516s0; Tue, 09 Nov 2010 17:01:06 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta07.emeryville.ca.mail.comcast.net with comcast id V5141f00R3LrwQ28U515Sn; Tue, 09 Nov 2010 17:01:05 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id A14D39B427; Tue, 9 Nov 2010 09:01:04 -0800 (PST) Date: Tue, 9 Nov 2010 09:01:04 -0800 From: Jeremy Chadwick To: "Zane C.B." Message-ID: <20101109170104.GA37882@icarus.home.lan> References: <20101109101740.2e8e80ce@vixen42.vulpes.vvelox.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20101109101740.2e8e80ce@vixen42.vulpes.vvelox.net> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: NFS locking X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2010 17:01:06 -0000 On Tue, Nov 09, 2010 at 10:17:40AM -0600, Zane C.B. wrote: > What does it take to get NFS locking working? > > Any time I start lockd on the server, I get the message below in > dmesg. > > NLM: failed to contact remote rpcbind, stat = 0, port = 0 > NLM: failed to contact remote rpcbind, stat = 0, port = 0 > Can't start NLM - unable to contact NSM > > Any ideas? http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2010-03/msg00484.html Solution (for me): http://lists.freebsd.org/pipermail/freebsd-stable/2010-March/056043.html -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Tue Nov 9 17:27:07 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1BD841065693 for ; Tue, 9 Nov 2010 17:27:07 +0000 (UTC) (envelope-from olli@lurza.secnetix.de) Received: from lurza.secnetix.de (lurza.secnetix.de [IPv6:2a01:170:102f::2]) by mx1.freebsd.org (Postfix) with ESMTP id 8C68D8FC0A for ; Tue, 9 Nov 2010 17:27:06 +0000 (UTC) Received: from lurza.secnetix.de (localhost [127.0.0.1]) by lurza.secnetix.de (8.14.3/8.14.3) with ESMTP id oA9HQoXl088175; Tue, 9 Nov 2010 18:27:05 +0100 (CET) (envelope-from oliver.fromme@secnetix.de) Received: (from olli@localhost) by lurza.secnetix.de (8.14.3/8.14.3/Submit) id oA9HQngL088174; Tue, 9 Nov 2010 18:26:49 +0100 (CET) (envelope-from olli) Date: Tue, 9 Nov 2010 18:26:49 +0100 (CET) Message-Id: <201011091726.oA9HQngL088174@lurza.secnetix.de> From: Oliver Fromme To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com In-Reply-To: X-Newsgroups: list.freebsd-fs User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX) (FreeBSD/6.4-PRERELEASE-20080904 (i386)) MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.3.5 (lurza.secnetix.de [127.0.0.1]); Tue, 09 Nov 2010 18:27:05 +0100 (CET) Cc: Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done() ??error=22] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2010 17:27:07 -0000 Monthadar Al Jaberi wrote: > I successfully tried mdconfig version, but gnop gives error message: > gnop: Invalid secsize for provider redboot/fs. I'm sorry, I shouldn't have mentioned gnop. 
It's probably not helpful in this case. > I looked into datasheet for the flash (MX25125805D) and it seems like > it can work in both 64K and 4K? Confused. > (http://www.macronix.com/QuickPlace/hq/PageLibrary482576EF002A2699.nsf/h_Index/30D2368B704F50B9482576EF002D070F/?OpenDocument&Type=Serial%20Flash&Density=128Mb) I'm not familiar with that kind of hardware, but if it can be switched to 4k sector mode, then that would make things a lot easier. > Would it help me if I changed the flash driver to work with 4K? Yes, definitely. > Or do I still need to either, mdconfig, gnop or play UFS/UFS2 code > (hard for me)? If everything else fails, I would simply create a memory disk with mdconfig (if you have enough RAM), copy the file system from flash to the memory disk (use "dd bs=64k ...") and mount it from there. That's not hard. > Basically I have a cross compiled kernel+mdroot with tinyBSD wireless > configuration, zipped and stored on the flash. So I am trying to have > a filesystem on the flash that will shadow changes. > > When I zipp it takes ~10M instead of 47M! Wait a second ... I don't understand ... Are you saying that you've put a compressed FS image on the flash? Is that the file system that you're trying to mount? Or are we talking about two distinct pieces of flash? If you just want to save changes (e.g. configuration files, log files and similar) to the flash, you can also simply use a tar archive: # tar -cb 128 -f /dev/ Best regards Oliver -- Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M. Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung: secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün- chen, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd "The last good thing written in C was Franz Schubert's Symphony number 9." 
-- Erwin Dieterich From owner-freebsd-fs@FreeBSD.ORG Tue Nov 9 17:37:32 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8C40D106564A for ; Tue, 9 Nov 2010 17:37:32 +0000 (UTC) (envelope-from carlson39@llnl.gov) Received: from smtp.llnl.gov (nspiron-3.llnl.gov [128.115.41.83]) by mx1.freebsd.org (Postfix) with ESMTP id 765E78FC19 for ; Tue, 9 Nov 2010 17:37:32 +0000 (UTC) X-Attachments: None Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135]) by smtp.llnl.gov with ESMTP; 09 Nov 2010 09:37:31 -0800 Message-ID: <4CD986DC.1070401@llnl.gov> Date: Tue, 09 Nov 2010 09:37:32 -0800 From: Mike Carlson User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <4CD84258.6090404@llnl.gov> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Re: 8.1-RELEASE: ZFS data errors X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2010 17:37:32 -0000 On 11/09/2010 03:23 AM, Ivan Voras wrote: > On 11/08/10 19:32, Mike Carlson wrote: > >> As soon as I create the volume and write data to it, it is reported as >> being corrupted: >> >> write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8 >> However, if I create a 'raidz' volume, no errors occur: > A very interesting problem. Can you check with some other kind of volume > manager that striping the data doesn't cause some unusual hardware > interaction? Can you try, as an experiment, striping them all with > gstripe (but you'll have to use a small stripe size like 16 KiB or 8 KiB)? 
> > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://BLOCKEDlists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > Sure: write# gstripe label -v -s 16384 data /dev/da2 /dev/da3 /dev/da4 /dev/da5 /dev/da6 /dev/da7 /dev/da8 Metadata value stored on /dev/da2. Metadata value stored on /dev/da3. Metadata value stored on /dev/da4. Metadata value stored on /dev/da5. Metadata value stored on /dev/da6. Metadata value stored on /dev/da7. Metadata value stored on /dev/da8. Done. write# newfs -O2 -U /dev/stripe/data /dev/stripe/data: 133522760.0MB (273454612256 sectors) block size 16384, fragment size 2048 using 627697 cylinder groups of 212.72MB, 13614 blks, 6848 inodes. with soft updates super-block backups (for fsck -b #) at: ... write# mount /dev/stripe/data /mnt write# df -h Filesystem Size Used Avail Capacity Mounted on /dev/da0s1a 1.7T 22G 1.6T 1% / devfs 1.0K 1.0K 0B 100% /dev /dev/stripe/data 126T 4.0K 116T 0% /mnt write# cd /tmp write# md5 /tmp/random.dat.1 MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 write# cp random.dat.1 /mnt/ write# cp /mnt/random.dat.1 /mnt/random.dat.2 write# cp /mnt/random.dat.1 /mnt/random.dat.3 write# cp /mnt/random.dat.1 /mnt/random.dat.4 write# cp /mnt/random.dat.1 /mnt/random.dat.5 write# cp /mnt/random.dat.1 /mnt/random.dat.6 write# md5 /mnt/* MD5 (/mnt/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 MD5 (/mnt/random.dat.2) = f795fa09e1b0975c0da0ec6e49544a36 MD5 (/mnt/random.dat.3) = f795fa09e1b0975c0da0ec6e49544a36 MD5 (/mnt/random.dat.4) = f795fa09e1b0975c0da0ec6e49544a36 MD5 (/mnt/random.dat.5) = f795fa09e1b0975c0da0ec6e49544a36 MD5 (/mnt/random.dat.6) = f795fa09e1b0975c0da0ec6e49544a36 write# fsck /mnt fsck: Could not determine filesystem type write# fsck_ufs /mnt ** /dev/stripe/data (NO WRITE) ** Last Mounted on /mnt ** Phase 1 - Check Blocks and Sizes Segmentation fault So, the data appears to be okay, I 
wanted to run through a FSCK just to do it but that seg faulted. Otherwise, that data looks good. Question, why did you recommend using a smaller stripe size? Is that to ensure a sample 1GB test file gets written across ALL disk members? From owner-freebsd-fs@FreeBSD.ORG Tue Nov 9 17:42:46 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AC0851065670 for ; Tue, 9 Nov 2010 17:42:46 +0000 (UTC) (envelope-from carlson39@llnl.gov) Received: from smtp.llnl.gov (nspiron-3.llnl.gov [128.115.41.83]) by mx1.freebsd.org (Postfix) with ESMTP id 8B1518FC1B for ; Tue, 9 Nov 2010 17:42:46 +0000 (UTC) X-Attachments: None Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135]) by smtp.llnl.gov with ESMTP; 09 Nov 2010 09:42:46 -0800 Message-ID: <4CD98816.1020306@llnl.gov> Date: Tue, 09 Nov 2010 09:42:46 -0800 From: Mike Carlson User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <4CD84258.6090404@llnl.gov> <4CD986DC.1070401@llnl.gov> In-Reply-To: <4CD986DC.1070401@llnl.gov> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Re: 8.1-RELEASE: ZFS data errors X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2010 17:42:46 -0000 On 11/09/2010 09:37 AM, Mike Carlson wrote: > On 11/09/2010 03:23 AM, Ivan Voras wrote: >> On 11/08/10 19:32, Mike Carlson wrote: >> >>> As soon as I create the volume and write data to it, it is reported as >>> being corrupted: >>> >>> write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8 >>> However, if I create a 'raidz' volume, no errors occur: >> A 
very interesting problem. Can you check with some other kind of volume >> manager that striping the data doesn't cause some unusual hardware >> interaction? Can you try, as an experiment, striping them all with >> gstripe (but you'll have to use a small stripe size like 16 KiB or 8 KiB)? >> >> _______________________________________________ >> freebsd-fs@freebsd.org mailing list >> http://BLOCKEDBLOCKEDlists.freebsd.org/mailman/listinfo/freebsd-fs >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >> > Sure: > > write# gstripe label -v -s 16384 data /dev/da2 /dev/da3 /dev/da4 > /dev/da5 /dev/da6 /dev/da7 /dev/da8 > Metadata value stored on /dev/da2. > Metadata value stored on /dev/da3. > Metadata value stored on /dev/da4. > Metadata value stored on /dev/da5. > Metadata value stored on /dev/da6. > Metadata value stored on /dev/da7. > Metadata value stored on /dev/da8. > Done. > write# newfs -O2 -U /dev/stripe/data > /dev/stripe/data: 133522760.0MB (273454612256 sectors) block size > 16384, fragment size 2048 > using 627697 cylinder groups of 212.72MB, 13614 blks, 6848 inodes. > with soft updates > super-block backups (for fsck -b #) at: > ... 
> write# mount /dev/stripe/data /mnt > write# df -h > Filesystem Size Used Avail Capacity Mounted on > /dev/da0s1a 1.7T 22G 1.6T 1% / > devfs 1.0K 1.0K 0B 100% /dev > /dev/stripe/data 126T 4.0K 116T 0% /mnt > write# cd /tmp > write# md5 /tmp/random.dat.1 > MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > write# cp random.dat.1 /mnt/ > write# cp /mnt/random.dat.1 /mnt/random.dat.2 > write# cp /mnt/random.dat.1 /mnt/random.dat.3 > write# cp /mnt/random.dat.1 /mnt/random.dat.4 > write# cp /mnt/random.dat.1 /mnt/random.dat.5 > write# cp /mnt/random.dat.1 /mnt/random.dat.6 > write# md5 /mnt/* > MD5 (/mnt/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36 > MD5 (/mnt/random.dat.2) = f795fa09e1b0975c0da0ec6e49544a36 > MD5 (/mnt/random.dat.3) = f795fa09e1b0975c0da0ec6e49544a36 > MD5 (/mnt/random.dat.4) = f795fa09e1b0975c0da0ec6e49544a36 > MD5 (/mnt/random.dat.5) = f795fa09e1b0975c0da0ec6e49544a36 > MD5 (/mnt/random.dat.6) = f795fa09e1b0975c0da0ec6e49544a36 > write# fsck /mnt > fsck: Could not determine filesystem type > write# fsck_ufs /mnt > ** /dev/stripe/data (NO WRITE) > ** Last Mounted on /mnt > ** Phase 1 - Check Blocks and Sizes > Segmentation fault > > So, the data appears to be okay, I wanted to run through a FSCK just to > do it but that seg faulted. Otherwise, that data looks good. > > Question, why did you recommend using a smaller stripe size? Is that to > ensure a sample 1GB test file gets written across ALL disk members? 
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
Oh, I almost forgot, here is the ZFS version of that gstripe array:

write# zpool create test01 /dev/stripe/data
write# cp /tmp/random.dat.1 /test01/
write# cp /test01/random.dat.1 /test01/random.dat.2
write# cp /test01/random.dat.1 /test01/random.dat.3
write# cp /test01/random.dat.1 /test01/random.dat.4
write# cp /test01/random.dat.1 /test01/random.dat.5
write# cp /test01/random.dat.1 /test01/random.dat.6
write# md5 /test01/*
MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test01/random.dat.2) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test01/random.dat.3) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test01/random.dat.4) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test01/random.dat.5) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test01/random.dat.6) = f795fa09e1b0975c0da0ec6e49544a36
write# zpool scrub
write# zpool status
  pool: test01
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Tue Nov  9 09:41:34 2010
config:

        NAME           STATE     READ WRITE CKSUM
        test01         ONLINE       0     0     0
          stripe/data  ONLINE       0     0     0

errors: No known data errors

Again, no errors.
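
The copy-and-compare check in the transcript above can be scripted so that it is easy to repeat with more copies or larger files. What follows is only a minimal sketch in Python using the standard library; the names `md5_of` and `mismatched_copies` are made up for illustration and are not part of any FreeBSD tool.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_copies(reference, directory, pattern="random.dat.*"):
    """List files in `directory` matching `pattern` whose MD5 differs from `reference`."""
    expected = md5_of(reference)
    return sorted(
        str(candidate)
        for candidate in Path(directory).glob(pattern)
        if md5_of(candidate) != expected
    )
```

For the test above, `mismatched_copies("/tmp/random.dat.1", "/test01")` would return an empty list when every copy matches the reference, which is the same result the `md5` output shows by eye.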
From owner-freebsd-fs@FreeBSD.ORG Wed Nov 10 00:20:08 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D9B1E106564A for ; Wed, 10 Nov 2010 00:20:08 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from mail2.fluidhosting.com (mx23.fluidhosting.com [204.14.89.6]) by mx1.freebsd.org (Postfix) with ESMTP id 785BF8FC20 for ; Wed, 10 Nov 2010 00:20:07 +0000 (UTC) Received: (qmail 26517 invoked by uid 399); 10 Nov 2010 00:20:07 -0000 Received: from localhost (HELO doug-optiplex.ka9q.net) (dougb@dougbarton.us@127.0.0.1) by localhost with ESMTPAM; 10 Nov 2010 00:20:07 -0000 X-Originating-IP: 127.0.0.1 X-Sender: dougb@dougbarton.us Message-ID: <4CD9E535.8000801@FreeBSD.org> Date: Tue, 09 Nov 2010 16:20:05 -0800 From: Doug Barton Organization: http://SupersetSolutions.com/ User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101028 Thunderbird/3.1.6 MIME-Version: 1.0 To: Aditya Sarawgi References: <20100929031825.L683@besplex.bde.org> <20100929084801.M948@besplex.bde.org> <20100929041650.GA1553@aditya> <201009290917.05269.jhb@freebsd.org> <20100929202526.GA1564@aditya> <4CD0A3E8.4080304@FreeBSD.org> <4CD201AE.3040409@FreeBSD.org> <20101108174327.GC2066@earth> In-Reply-To: <20101108174327.GC2066@earth> X-Enigmail-Version: 1.1.2 OpenPGP: id=1A1ABC84 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: fs@freebsd.org Subject: Re: ext2fs now extremely slow X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Nov 2010 00:20:08 -0000 On 11/08/2010 09:43, Aditya Sarawgi wrote: > On Wed, Nov 03, 2010 at 05:43:26PM -0700, Doug Barton wrote: >> Regarding stability, sometimes (but not always) when I'm doing the above >> listed disk-intensive things on an 
otherwise idle system I've had the >> system lock up. Not panic, not reboot, just wedge. I'm running X when >> this happens, so I'm not 100% sure that the disk activity is the >> culprit, but it seems very suspicious. Yesterday was a very bad day, I >> had to do 3 tries to get all the way through a buildworld/kernel, mostly >> because the last 2 crashes resulted in my /usr/src (which is actually >> /home/svn/head) and /usr/obj (/home/obj-9) directories getting corrupted >> respectively. Today (running r214694) has actually been quite good, >> although I haven't tried a buildworld yet. >> > > I am not sure if this is the right use case for ext2fs Can you expand on that? What about it do you see as problematic? >>> You can test Zheng's preallocation patch for ext2fs, there is a >>> serious lack of testers for that. >> >> I would be happy to do that, but my reading of this thread last month >> didn't produce a clear "try this version of the patch" neon sign. >> Various people referred to suggestions, updates, etc. If someone could >> provide a URL for the right patch to try, as well as a suggestion for >> benchmarking methodology, I'll be glad to do so. >> > > I have attached the patch. Thanks for that. I'm curious though whether this is the latest version of the patch with the suggested improvements from earlier in this thread? > Some primitive testing like copying files, > untaring etc and comparing with the existing ext2fs will do. If you > are looking to do a full fledged benchmarking then I would suggest > iozone, blogbench, dbench etc. Sorry, I am not a filesystem person, so if you want me to do any real benchmarking you're going to have to give me details ... Install this program, run this test, etc. Meanwhile I finally got around to setting up my 8.1-RELEASE partition on this same system and particularly with cvsup it's very noticeable that ext2fs in -current is MUCH slower than in RELENG_8. 
I'll do some before and after tests on -current, then I'll do the same thing on 8.1 and see how the numbers compare. Doug -- Nothin' ever doesn't change, but nothin' changes much. -- OK Go Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. :) http://SupersetSolutions.com/ From owner-freebsd-fs@FreeBSD.ORG Wed Nov 10 06:34:08 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A5303106566B; Wed, 10 Nov 2010 06:34:08 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 892F28FC0C; Wed, 10 Nov 2010 06:34:07 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id IAA10495; Wed, 10 Nov 2010 08:34:06 +0200 (EET) (envelope-from avg@freebsd.org) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1PG4Fx-000Agy-QS; Wed, 10 Nov 2010 08:34:05 +0200 Message-ID: <4CDA3CDD.5000404@freebsd.org> Date: Wed, 10 Nov 2010 08:34:05 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: Ivan Voras References: <4CD7C8FC.900@icyb.net.ua> <4CD7E515.5040209@icyb.net.ua> <4CD7E960.1070200@freebsd.org> In-Reply-To: <4CD7E960.1070200@freebsd.org> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: another fuse panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Nov 2010 06:34:08 -0000 on 08/11/2010 14:13 
Andriy Gapon said the following: > on 08/11/2010 13:55 Andriy Gapon said the following: >> I reliable got this panic when all I was doing is saving an attachment in >> thunderbird 3 that ran in KDE 4 environment. Not sure what was going on behind >> the scenes, but shouldn't have been anything out of the ordinary. > > Perhaps this is my local mistake. I can't see from code and crash dump how NULL > pointer is possible there. So perhaps I have some ABI mismatch between kernel > and fuse module. > I will rebuild fuse kmod and re-test again. Yes, the rebuild has helped. I wish this could be nicely automated. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Nov 10 07:44:38 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8978E106564A for ; Wed, 10 Nov 2010 07:44:38 +0000 (UTC) (envelope-from monthadar@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id 408888FC1B for ; Wed, 10 Nov 2010 07:44:37 +0000 (UTC) Received: by qwg8 with SMTP id 8so394654qwg.13 for ; Tue, 09 Nov 2010 23:44:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=FVKfQrOebaQb6qQ0NBhIEsDJCsuXt2IsH5SVO21Nu6E=; b=YH2hYWphPwiagS1fUUS3xD1kqMoFiu8UP4WD+sto9mGvZGJhwjrfJQyePeuidbspP2 WdmizMSI9qUIT7PTmC42zgxrjsf1F/YXSoKeoHUgEkMggIZqnwsD+LLSKdpqdG1onR04 ZGfpXputK8bkEW6WLdPKHZWIw92YZGZ22zf4c= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; b=bjeXSjO5aiUPRo/MtC6G3xCv/ku2UGtaYTvCq5mJBS7k0ukDqQEPn8tZVHVs7EaMaX Vh47jZ16u7X7Vyq1ptiEY4wOwGbThsR1dC+p3Ryr8MbZXJhePdfZfRtkrblWgOtfUJ/f 
sqFe2zG9lS65qla55liED70qh7qbDfqX567xk=
MIME-Version: 1.0
Received: by 10.224.193.201 with SMTP id dv9mr6212274qab.125.1289375077306;
	Tue, 09 Nov 2010 23:44:37 -0800 (PST)
Received: by 10.229.182.77 with HTTP; Tue, 9 Nov 2010 23:44:37 -0800 (PST)
In-Reply-To: <201011091726.oA9HQngL088174@lurza.secnetix.de>
References: <201011091726.oA9HQngL088174@lurza.secnetix.de>
Date: Wed, 10 Nov 2010 08:44:37 +0100
Message-ID:
From: Monthadar Al Jaberi
To: freebsd-fs@freebsd.org, olli@lurza.secnetix.de
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc:
Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done() ??error=22]
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 10 Nov 2010 07:44:38 -0000

>  > Would it help me if I changed the flash driver to work with 4K?
>
> Yes, definitely.

Ok, I will look into it.

>
>  > Or do I still need to either, mdconfig, gnop or play UFS/UFS2 code
>  > (hard for me)?
>
> If everything else fails, I would simply create a memory
> disk with mdconfig (if you have enough RAM), copy the file
> system from flash to the memory disk (use "dd bs=64k ...")
> and mount it from there.  That's not hard.

Yepp, works like a charm, Thank you =)
My board has 128MB ram and 16MB flash.

>
>  > Basically I have a cross compiled kernel+mdroot with tinyBSD wireless
>  > configuration, zipped and stored on the flash. So I am trying to have
>  > a filesystem on the flash that will shadow changes.
>  >
>  > When I zipp it takes ~10M instead of 47M!
>
> Wait a second ...  I don't understand ...  Are you saying
> that you've put a compressed FS image on the flash?  Is
> that the file system that you're trying to mount?  Or are
> we talking about two distinct pieces of flash?

No, it's another filesystem that I want to mount.
The root filesystem is inside the kernel image (MD_ROOT option), and the
kernel mounts from it (/dev/md0).
When the kernel image was generated I zipped it and stored it on the
flash, so a bootloader like Redboot will unzip it to RAM and run the
kernel. So I guess it's messy to touch the filesystem that is inside a
zipped kernel image. The filesystem itself is not zipped.

So the idea is to create another directory in the flash FIS (Flash Image
System) where I store new versions of some files.

And your tar advice works nicely, that makes it easier than having a
filesystem on flash and it will be zipped too!!!!! Thank you :)

Maybe no need to fiddle with the flash driver and UFS code, who knows... :P

I guess the ideal scenario is having a separate, directly read/write
filesystem on the flash for some paths like /etc, while /var and /tmp are
mounted as /dev/mdX and the rest is read-only.

Thank you!

>
> --
> Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
> Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
> secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
> chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart
>
> FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd
>
> "The last good thing written in C was
> Franz Schubert's Symphony number 9."
>        -- Erwin Dieterich
>

-- 
//Monthadar Al Jaberi

From owner-freebsd-fs@FreeBSD.ORG Wed Nov 10 11:03:18 2010
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 28B89106564A
	for ; Wed, 10 Nov 2010 11:03:18 +0000 (UTC)
	(envelope-from freebsd-fs@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id A86C58FC15
	for ; Wed, 10 Nov 2010 11:03:17 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from ) id 1PG8SN-0005zW-Ah
	for freebsd-fs@freebsd.org; Wed, 10 Nov 2010 12:03:11 +0100
Received: from lara.cc.fer.hr ([161.53.72.113])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00 for ; Wed, 10 Nov 2010 12:03:11 +0100
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00 for ; Wed, 10 Nov 2010 12:03:11 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-fs@freebsd.org
From: Ivan Voras
Date: Wed, 10 Nov 2010 12:03:00 +0100
Lines: 58
Message-ID:
References: <4CD84258.6090404@llnl.gov> <4CD986DC.1070401@llnl.gov>
	<4CD98816.1020306@llnl.gov>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12)
	Gecko/20101102 Thunderbird/3.1.6
In-Reply-To: <4CD98816.1020306@llnl.gov>
X-Enigmail-Version: 1.1.2
Subject: Re: 8.1-RELEASE: ZFS data errors
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 10 Nov 2010 11:03:18 -0000

On 11/09/10 18:42, Mike Carlson wrote:

>> write# gstripe label -v -s 16384 data /dev/da2 /dev/da3 /dev/da4
>> /dev/da5 /dev/da6 /dev/da7 /dev/da8
>> write# df -h
>> Filesystem          Size    Used   Avail Capacity  Mounted on
>> /dev/da0s1a         1.7T     22G    1.6T     1%    /
>> devfs               1.0K    1.0K      0B   100%    /dev
>> /dev/stripe/data    126T    4.0K    116T     0%    /mnt
>> write# fsck /mnt
>> fsck: Could not determine filesystem type
>> write# fsck_ufs /mnt
>> ** /dev/stripe/data (NO WRITE)
>> ** Last Mounted on /mnt
>> ** Phase 1 - Check Blocks and Sizes
>> Segmentation fault

>> So, the data appears to be okay, I wanted to run through a FSCK just to
>> do it but that seg faulted. Otherwise, that data looks good.

Hmm, probably it tried to allocate a gazillion internal structures to
check it and didn't take no for an answer.

>> Question, why did you recommend using a smaller stripe size? Is that to
>> ensure a sample 1GB test file gets written across ALL disk members?

Yes, it's the surest way since MAXPHYS=128 KiB / 8 = 16 KiB.

Well, as far as I'm concerned this probably shows that there is nothing
wrong with the hardware or GEOM, though more testing, like running a
couple of bonnie++ rounds on the UFS on the stripe volume for a few
hours, would probably be better.

Btw. what bandwidth do you get from this combination (gstripe + UFS)?

> Oh, I almost forgot, here is the ZFS version of that gstripe array:
>
> write# zpool create test01 /dev/stripe/data
> write# zpool scrub
> write# zpool status
>   pool: test01
>  state: ONLINE
>  scrub: scrub completed after 0h0m with 0 errors on Tue Nov  9
> 09:41:34 2010
> config:
>
>         NAME           STATE     READ WRITE CKSUM
>         test01         ONLINE       0     0     0
>           stripe/data  ONLINE       0     0     0

"scrub" verifies only written data, not the whole file system space
(that's why it finishes so fast), so it isn't really putting any load on
the array, but I agree that it looks more and more like there really is
an issue in ZFS.
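
The stripe-size arithmetic above (MAXPHYS = 128 KiB, stripe = 16 KiB) can be checked with a few lines of code. This is only a sketch under a stated assumption: it models a plain round-robin layout in which stripe number k lands on member k mod n, which is the usual RAID0 arrangement but is not taken from the gstripe source. The function name `disks_touched` is hypothetical. Note that the message divides by 8 while the `gstripe label` command lists seven members (da2 through da8); with 16 KiB stripes a 128 KiB request still covers eight consecutive stripes, so every member is hit.

```python
KIB = 1024
MAXPHYS = 128 * KIB  # largest single I/O size, as quoted in the message above


def disks_touched(offset, length, stripe_size, ndisks):
    """Return the sorted member indices a request touches.

    Assumes a simple round-robin layout: stripe k is stored on member k % ndisks.
    """
    if length <= 0:
        return []
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return sorted({stripe % ndisks for stripe in range(first, last + 1)})


if __name__ == "__main__":
    # 16 KiB stripes: one 128 KiB request spans stripes 0..7, so all 7 members.
    print(disks_touched(0, MAXPHYS, 16 * KIB, 7))   # → [0, 1, 2, 3, 4, 5, 6]
    # 128 KiB stripes: the same request fits in a single stripe on one member.
    print(disks_touched(0, MAXPHYS, 128 * KIB, 7))  # → [0]
```

Under this model, a larger stripe size lets a single maximum-sized request stay on one disk, so a short test could leave some members idle; the small stripe forces every request of that size across the whole array.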
From owner-freebsd-fs@FreeBSD.ORG Wed Nov 10 17:05:37 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 484B31065673; Wed, 10 Nov 2010 17:05:37 +0000 (UTC) (envelope-from sarawgi.aditya@gmail.com) Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id 0E96D8FC12; Wed, 10 Nov 2010 17:05:36 +0000 (UTC) Received: by pwi10 with SMTP id 10so193682pwi.13 for ; Wed, 10 Nov 2010 09:05:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:cc:subject :message-id:references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=doHDc3Be3vx7yab1G/OcJncfuNovjKpeU+Flb4/uKlM=; b=JnNtUfxvt0j3a8aifyoTbRW3guaws2mGb44PW6o32bNFEsBOScpubiKvwITxnbWrxy ONhx7ae0jtDDi6GPhxbas19cd7o3eEcI4Y0foz3mMXN27s6/+GbN/nDDYVmBwXPLf5Jb hR1biIFMH6HD6InArH1lxLrwIMsnARxy9DcYQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=q/nwmuknIjTdD9DtVV6kJfIyKWKMI1pOAj3VTvQ1UCh3jPZJXBgI1GjTGFI+M1/ua9 rW17iWTlgQ5XS2IhEYt3qMTNSDRi6YziD/AW8kr9foDAI2D640MJNSmJ926Dby0ihS6I lFnAVAmbus+Mb6zypcYpvKQNg+2mXmtGSZs4s= Received: by 10.42.177.73 with SMTP id bh9mr1657654icb.162.1289408734429; Wed, 10 Nov 2010 09:05:34 -0800 (PST) Received: from earth ([183.87.49.208]) by mx.google.com with ESMTPS id gy41sm1082604ibb.23.2010.11.10.09.05.32 (version=TLSv1/SSLv3 cipher=RC4-MD5); Wed, 10 Nov 2010 09:05:33 -0800 (PST) Date: Wed, 10 Nov 2010 22:37:22 +0530 From: Aditya Sarawgi To: Doug Barton Message-ID: <20101110170719.GA1573@earth> References: <20100929031825.L683@besplex.bde.org> <20100929084801.M948@besplex.bde.org> <20100929041650.GA1553@aditya> <201009290917.05269.jhb@freebsd.org> 
<20100929202526.GA1564@aditya> <4CD0A3E8.4080304@FreeBSD.org> <4CD201AE.3040409@FreeBSD.org> <20101108174327.GC2066@earth> <4CD9E535.8000801@FreeBSD.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4CD9E535.8000801@FreeBSD.org>
User-Agent: Mutt/1.5.20 (2009-06-14)
Cc: fs@freebsd.org
Subject: Re: ext2fs now extremely slow
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 10 Nov 2010 17:05:37 -0000

On Tue, Nov 09, 2010 at 04:20:05PM -0800, Doug Barton wrote:
> On 11/08/2010 09:43, Aditya Sarawgi wrote:
> > On Wed, Nov 03, 2010 at 05:43:26PM -0700, Doug Barton wrote:
>
> >> Regarding stability, sometimes (but not always) when I'm doing the above
> >> listed disk-intensive things on an otherwise idle system I've had the
> >> system lock up. Not panic, not reboot, just wedge. I'm running X when
> >> this happens, so I'm not 100% sure that the disk activity is the
> >> culprit, but it seems very suspicious. Yesterday was a very bad day, I
> >> had to do 3 tries to get all the way through a buildworld/kernel, mostly
> >> because the last 2 crashes resulted in my /usr/src (which is actually
> >> /home/svn/head) and /usr/obj (/home/obj-9) directories getting corrupted
> >> respectively. Today (running r214694) has actually been quite good,
> >> although I haven't tried a buildworld yet.
> >>
> >
> > I am not sure if this is the right use case for ext2fs
>
> Can you expand on that? What about it do you see as problematic?
>

ext2fs is not a native filesystem. It is relatively slower than UFS and
may have other problems, such as deadlocks, that have not been discovered
yet. It may also leave data inconsistent because it lacks facilities like
journaling. It is only meant to make the data in your Linux partitions
accessible.

> >>> You can test Zheng's preallocation patch for ext2fs, there is a
> >>> serious lack of testers for that.
> >>
> >> I would be happy to do that, but my reading of this thread last month
> >> didn't produce a clear "try this version of the patch" neon sign.
> >> Various people referred to suggestions, updates, etc. If someone could
> >> provide a URL for the right patch to try, as well as a suggestion for
> >> benchmarking methodology, I'll be glad to do so.
> >>
> >
> > I have attached the patch.
>
> Thanks for that. I'm curious though whether this is the latest version
> of the patch with the suggested improvements from earlier in this thread?
>

The new patch will contain only style and comment fixes.

> > Some primitive testing like copying files,
> > untaring etc and comparing with the existing ext2fs will do. If you
> > are looking to do a full fledged benchmarking then I would suggest
> > iozone, blogbench, dbench etc.
>
> Sorry, I am not a filesystem person, so if you want me to do any real
> benchmarking you're going to have to give me details ... Install this
> program, run this test, etc.
>

Install dbench and blogbench from ports or packages and follow this:
http://wiki.freebsd.org/SOC2010ZhengLiu

> Meanwhile I finally got around to setting up my 8.1-RELEASE partition on
> this same system and particularly with cvsup it's very noticeable that
> ext2fs in -current is MUCH slower than in RELENG_8. I'll do some before
> and after tests on -current, then I'll do the same thing on 8.1 and see
> how the numbers compare.
>

Yes, we know there is some scope for improvement. The BSDL ext2fs lacks
preallocation, which used to reserve some blocks ahead of time and
improved sequential write performance. This problem is solved by Zheng's
reservation window work. The other issue, as Bruce mentioned, is that the
block allocator algorithm skips some of the blocks in between. We intend
to have fixes for both of these before 9 is released.
Can you also mail me privately the output of dumpe2fs Thanks Aditya Sarawgi From owner-freebsd-fs@FreeBSD.ORG Wed Nov 10 18:56:01 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 23D0A106564A; Wed, 10 Nov 2010 18:56:01 +0000 (UTC) (envelope-from pluknet@gmail.com) Received: from mail-qy0-f182.google.com (mail-qy0-f182.google.com [209.85.216.182]) by mx1.freebsd.org (Postfix) with ESMTP id BA2F98FC12; Wed, 10 Nov 2010 18:56:00 +0000 (UTC) Received: by qyk30 with SMTP id 30so744085qyk.13 for ; Wed, 10 Nov 2010 10:56:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=jJXPW5Gg9GsBZGAn+awKJIbu1ydDK9tuxD64unKKGM0=; b=FU7SlrNdE5y9LiKpPOnMSHmCyuKvbAMoTOsbKZW273rdBEm2F9Wo7pHfmlU1yQO+xB qTT/oAm3pjpXTwhi4ZJHsI459AorzAPwvb7NDWWt86frBLYx9SBwc807vrNE17tY3eSa 1V8HY0KlGwwmMH+z/7Rc20x4/naFWBnpw9jzE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=lJHWAgGvYAb8is7hKUx3JemolyJu2JABok9CxR3qnTloN5HF77lbUrQr7C5K9XKBq5 SFdyP4tb9i0w80OfeBAgTuIiBMNJURHFECjWec3wmmpHgOE4kO0/CYr9EDJnUnj57p9o lJTwYPy0QkOTjZE+/Q3nMyGPuHKuaHkwVhH8U= MIME-Version: 1.0 Received: by 10.229.229.135 with SMTP id ji7mr5330367qcb.100.1289413566167; Wed, 10 Nov 2010 10:26:06 -0800 (PST) Received: by 10.229.69.135 with HTTP; Wed, 10 Nov 2010 10:26:06 -0800 (PST) In-Reply-To: <4CDA3CDD.5000404@freebsd.org> References: <4CD7C8FC.900@icyb.net.ua> <4CD7E515.5040209@icyb.net.ua> <4CD7E960.1070200@freebsd.org> <4CDA3CDD.5000404@freebsd.org> Date: Wed, 10 Nov 2010 21:26:06 +0300 Message-ID: From: Sergey Kandaurov To: Andriy Gapon Content-Type: text/plain; charset=ISO-8859-1 
Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org, Ivan Voras Subject: Re: another fuse panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Nov 2010 18:56:01 -0000 On 10 November 2010 09:34, Andriy Gapon wrote: > on 08/11/2010 14:13 Andriy Gapon said the following: >> on 08/11/2010 13:55 Andriy Gapon said the following: >>> I reliable got this panic when all I was doing is saving an attachment = in >>> thunderbird 3 that ran in KDE 4 environment. =A0Not sure what was going= on behind >>> the scenes, but shouldn't have been anything out of the ordinary. >> >> Perhaps this is my local mistake. =A0I can't see from code and crash dum= p how NULL >> pointer is possible there. =A0So perhaps I have some ABI mismatch betwee= n kernel >> and fuse module. >> I will rebuild fuse kmod and re-test again. > > Yes, the rebuild has helped. > I wish this could be nicely automated. Hi. If I understood you correctly, then you need PORTS_MODULES set in /etc/make.conf. 
-- 
wbr,
pluknet

From owner-freebsd-fs@FreeBSD.ORG Wed Nov 10 19:08:39 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 58C73106566C; Wed, 10 Nov 2010 19:08:39 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 182408FC1B; Wed, 10 Nov 2010 19:08:37 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id VAA26702; Wed, 10 Nov 2010 21:08:35 +0200 (EET) (envelope-from avg@freebsd.org) Message-ID: <4CDAEDB2.4010704@freebsd.org> Date: Wed, 10 Nov 2010 21:08:34 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: Sergey Kandaurov References: <4CD7C8FC.900@icyb.net.ua> <4CD7E515.5040209@icyb.net.ua> <4CD7E960.1070200@freebsd.org> <4CDA3CDD.5000404@freebsd.org> In-Reply-To: X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org, Ivan Voras Subject: Re: another fuse panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Nov 2010 19:08:39 -0000

on 10/11/2010 20:26 Sergey Kandaurov said the following:
> Hi.
> If I understood you correctly, then you need
> PORTS_MODULES set in /etc/make.conf.

It was a long time ago when I tried it last time, but I remember having problems
with it during upgrades.
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Nov 10 19:42:46 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 348FB106566C; Wed, 10 Nov 2010 19:42:46 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 0DA538FC0C; Wed, 10 Nov 2010 19:42:44 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id VAA27160; Wed, 10 Nov 2010 21:42:42 +0200 (EET) (envelope-from avg@freebsd.org) Message-ID: <4CDAF5B1.4040501@freebsd.org> Date: Wed, 10 Nov 2010 21:42:41 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: Sergey Kandaurov References: <4CD7C8FC.900@icyb.net.ua> <4CD7E515.5040209@icyb.net.ua> <4CD7E960.1070200@freebsd.org> <4CDA3CDD.5000404@freebsd.org> <4CDAEDB2.4010704@freebsd.org> In-Reply-To: <4CDAEDB2.4010704@freebsd.org> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org, Ivan Voras Subject: Re: another fuse panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Nov 2010 19:42:46 -0000 on 10/11/2010 21:08 Andriy Gapon said the following: > on 10/11/2010 20:26 Sergey Kandaurov said the following: >> Hi. >> If I understood you correctly, then you need >> PORTS_MODULES set in /etc/make.conf. > > It was a long time ago when I tried it last time, but I remember having problems > with it during upgrades. I think this is what it was/is. 
If a port in PORTS_MODULES has dependencies, then buildkernel would try to install those dependencies even if they are already installed. And that, obviously, would fail. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Nov 10 19:49:29 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A6390106564A for ; Wed, 10 Nov 2010 19:49:29 +0000 (UTC) (envelope-from carlson39@llnl.gov) Received: from smtp.llnl.gov (nspiron-3.llnl.gov [128.115.41.83]) by mx1.freebsd.org (Postfix) with ESMTP id 85BD68FC17 for ; Wed, 10 Nov 2010 19:49:29 +0000 (UTC) X-Attachments: None Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135]) by smtp.llnl.gov with ESMTP; 10 Nov 2010 11:49:28 -0800 Message-ID: <4CDAF749.4000805@llnl.gov> Date: Wed, 10 Nov 2010 11:49:29 -0800 From: Mike Carlson User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <4CD84258.6090404@llnl.gov> <4CD986DC.1070401@llnl.gov> <4CD98816.1020306@llnl.gov> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Re: 8.1-RELEASE: ZFS data errors X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Nov 2010 19:49:29 -0000 On 11/10/2010 03:03 AM, Ivan Voras wrote: > On 11/09/10 18:42, Mike Carlson wrote: > >>> write# gstripe label -v -s 16384 data /dev/da2 /dev/da3 /dev/da4 >>> /dev/da5 /dev/da6 /dev/da7 /dev/da8 >>> write# df -h >>> Filesystem Size Used Avail Capacity Mounted on >>> /dev/da0s1a 1.7T 22G 1.6T 1% / >>> devfs 1.0K 1.0K 0B 100% /dev >>> /dev/stripe/data 126T 4.0K 116T 0% /mnt >>> write# fsck /mnt >>> fsck: 
Could not determine filesystem type >>> write# fsck_ufs /mnt >>> ** /dev/stripe/data (NO WRITE) >>> ** Last Mounted on /mnt >>> ** Phase 1 - Check Blocks and Sizes >>> Segmentation fault >>> So, the data appears to be okay, I wanted to run through a FSCK just to >>> do it but that seg faulted. Otherwise, that data looks good. > Hmm, probably it tried to allocate a gazillion internal structures to > check it and didn't take no for an answer. > >>> Question, why did you recommend using a smaller stripe size? Is that to >>> ensure a sample 1GB test file gets written across ALL disk members? > Yes, it's the surest way since MAXPHYS=128 KiB / 8 = 16 KiB. > > Well, as far as I'm concerned this probably shows that there isn't > something wrong about hardware or GEOM, though more testing, like > running a couple of bonnie++ rounds on the UFS on the stripe volume for > a few hours, would probably be better. > > Btw. what bandwidth do you get from this combination (gstripe + UFS)? > The bandwidth for geom_stripe + UFS2 was very nice: write# mount /dev/da0s1a on / (ufs, local, soft-updates) devfs on /dev (devfs, local, multilabel) filevol002 on /filevol002 (zfs, local) /dev/stripe/data on /mnt (ufs, local, soft-updates) Simple DD write: write# dd if=/dev/zero of=/mnt/zero.dat bs=1m count=5000 5000+0 records in 5000+0 records out 5242880000 bytes transferred in 13.503850 secs (388250759 bytes/sec) running bonnie++ write# bonnie++ -u 100 -s24576 -d. -n64 Using uid:100, gid:65533. Writing a byte at a time...done Writing intelligently...done Rewriting...done Reading a byte at a time...done Reading intelligently...done start 'em...done...done...done...done...done... Create files in sequential order...done. Stat files in sequential order...done. Delete files in sequential order...done. Create files in random order...done. Stat files in random order...done. Delete files in random order...done. 
Version  1.96       ------Sequential Output------    --Sequential Input-   --Random-
Concurrency   1     -Per Chr-  --Block--  -Rewrite-  -Per Chr-  --Block--  --Seeks--
Machine        Size  K/sec %CP  K/sec %CP  K/sec %CP  K/sec %CP  K/sec %CP   /sec %CP
write.llnl.gov  24G    730  99 343750  63 106157  26   1111  86 174698  26  219.2   3
Latency                11492us      149ms      227ms    70274us    66776us      766ms
Version  1.96       ------Sequential Create------    --------Random Create--------
write.llnl.gov      -Create--  --Read---  -Delete--  -Create--  --Read---  -Delete--
              files   /sec %CP   /sec %CP   /sec %CP   /sec %CP   /sec %CP   /sec %CP
                 64  18681  47  +++++ +++  99516  97  26297  40  +++++ +++ 113937  96
Latency                  310ms      149us      152us    68841us      144us      146us
1.96,1.96,write.llnl.gov,1,1289416723,24G,,730,99,343750,63,106157,26,1111,86,174698,26,219.2,3,64,,,,,18681,47,+++++,+++,99516,97,26297,40,+++++,+++,113937,96,11492us,149ms,227ms,70274us,66776us,766ms,310ms,149us,152us,68841us,144us,146us

The system immediately and mysteriously rebooted after running bonnie++ though; that doesn't seem like a good sign...

I've got an iozone benchmark, gstripe + multipath + UFS vs. multipath + ZFS. I can email the gzip'd file to you, as I don't want to clutter the mailing list with file attachments.

Another question, for anyone really: will gmultipath ever have an 'active/active' model? I'm happy that I have some type of redundancy for my SAN, but if it were possible to aggregate the bandwidth of both controllers, that would be pretty cool as well.
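[Editorial aside] The stripe-size reasoning quoted earlier in this message (a MAXPHYS-sized 128 KiB request against a 16 KiB stripe) can be illustrated with a small C sketch. It assumes a plain round-robin RAID0 layout; it is not taken from the geom_stripe source, and the function names stripe_member and members_touched are hypothetical. The figures 16384 and 7 come from the gstripe command shown in the thread (da2 through da8).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Index of the stripe member that holds the byte at `offset`,
 * assuming stripes are laid out round-robin across `ndisks` members.
 */
static unsigned
stripe_member(uint64_t offset, uint64_t stripesize, unsigned ndisks)
{
	assert(stripesize > 0 && ndisks > 0);
	return (unsigned)((offset / stripesize) % ndisks);
}

/*
 * Number of distinct members touched by the request
 * [offset, offset + len). Consecutive stripes land on consecutive
 * members, so the answer is the stripe count capped at ndisks.
 */
static unsigned
members_touched(uint64_t offset, uint64_t len, uint64_t stripesize,
    unsigned ndisks)
{
	uint64_t first, last, nstripes;

	assert(stripesize > 0 && ndisks > 0);
	if (len == 0)
		return 0;
	first = offset / stripesize;
	last = (offset + len - 1) / stripesize;
	nstripes = last - first + 1;
	return (nstripes >= ndisks) ? ndisks : (unsigned)nstripes;
}
```

Under these assumptions, members_touched(0, 131072, 16384, 7) yields 7, so a single 128 KiB request already reaches every disk, whereas a 128 KiB stripe (members_touched(0, 131072, 131072, 7)) would keep the same request on one disk.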
>> Oh, I almost forgot, here is the ZFS version of that gstripe array:
>>
>> write# zpool create test01 /dev/stripe/data
>> write# zpool scrub
>> write# zpool status
>>   pool: test01
>>  state: ONLINE
>>  scrub: scrub completed after 0h0m with 0 errors on Tue Nov 9
>> 09:41:34 2010
>> config:
>>
>>         NAME           STATE     READ WRITE CKSUM
>>         test01         ONLINE       0     0     0
>>           stripe/data  ONLINE       0     0     0
> "scrub" verifies only written data, not the whole file system space
> (that's why it finishes so fast), so it isn't really doing any load on
> the array, but I agree that it looks more and more like there really is
> an issue in ZFS.
>
Yeah, I ran scrub when there was around 20GB of random data. In 8.1-RELEASE, that was the way I would trigger ZFS's acknowledgment that the pool had a problem.

I also dug through my logs and saw these:

Nov 8 15:09:51 write root: ZFS: checksum mismatch, zpool=test01 path=/dev/da5 offset=749207552 size=131072
Nov 8 15:09:51 write root: ZFS: checksum mismatch, zpool=test01 path=/dev/da5 offset=749338624 size=131072
Nov 8 15:09:51 write root: ZFS: zpool I/O failure, zpool=test01 error=86
Nov 8 15:09:51 write root: ZFS: zpool I/O failure, zpool=test01 error=86
Nov 8 15:09:51 write root: ZFS: checksum mismatch, zpool=test01 path=/dev/da3 offset=748421120 size=131072
Nov 8 15:09:51 write root: ZFS: checksum mismatch, zpool=test01 path=/dev/da4 offset=746586112 size=131072
Nov 8 15:09:51 write root: ZFS: checksum mismatch, zpool=test01 path=/dev/da4 offset=746455040 size=131072
Nov 8 15:09:51 write root: ZFS: checksum mismatch, zpool=test01 path=/dev/da4 offset=746717184 size=131072
Nov 8 15:09:52 write root: ZFS: checksum mismatch, zpool=test01 path=/dev/da3 offset=748290048 size=131072
Nov 8 15:09:52 write root: ZFS: checksum mismatch, zpool=test01 path=/dev/da3 offset=748421120 size=131072
Nov 8 15:09:52 write root: ZFS: checksum mismatch, zpool=test01 path=/dev/da4 offset=746586112 size=131072
Nov 8 15:09:52 write root: ZFS: zpool I/O failure, zpool=test01 error=86
I'm inclined to believe it is an issue with ZFS. From owner-freebsd-fs@FreeBSD.ORG Wed Nov 10 20:18:44 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C5C731065697; Wed, 10 Nov 2010 20:18:44 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id D44748FC1B; Wed, 10 Nov 2010 20:18:43 +0000 (UTC) Received: by wya21 with SMTP id 21so1218791wya.13 for ; Wed, 10 Nov 2010 12:18:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=ZBBfufQUZ146StgDszFB2/UvQATpM1G14gpMtjs5ZU0=; b=KQ2N+7AMsTqTSFeGJ158mPfHrAS9TDoNNX93u9bDP4ii13Pp7yrnuqEd3xccBPJXqK ZFJFkMGO9f9BhuQWeElnDATtyBap+d9m1YwONQycvOQuEJEwO2sW9RWtF+rjpHIF/JTE fSGFiXW+53qrc8J8z2S+GpafGCG2JnV3e0yWw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=FOu7eS65HP4mY777/xXOXW9NP2NT8gIcbmE5Po+9lxopgWy2jGvd5scIZm0ROso1V8 450tR/mmHwhXjOKZWw0Tzis0OxqrAkSA4oGpO9XabysHQZtVGk3xvG+PoVFet+C8fzf6 n2S3tEHXqeh/2FslcWShDFPv3WPG0eSKW35SM= MIME-Version: 1.0 Received: by 10.216.175.18 with SMTP id y18mr8885463wel.30.1289420322665; Wed, 10 Nov 2010 12:18:42 -0800 (PST) Sender: yanegomi@gmail.com Received: by 10.216.198.27 with HTTP; Wed, 10 Nov 2010 12:18:42 -0800 (PST) In-Reply-To: <4CDAF5B1.4040501@freebsd.org> References: <4CD7C8FC.900@icyb.net.ua> <4CD7E515.5040209@icyb.net.ua> <4CD7E960.1070200@freebsd.org> <4CDA3CDD.5000404@freebsd.org> <4CDAEDB2.4010704@freebsd.org> <4CDAF5B1.4040501@freebsd.org> Date: Wed, 10 Nov 2010 
12:18:42 -0800 X-Google-Sender-Auth: -JRvsaMOZ3PeCq9259iCRjUctSY Message-ID: From: Garrett Cooper To: Andriy Gapon Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, Ivan Voras , freebsd-current@freebsd.org Subject: Re: another fuse panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Nov 2010 20:18:44 -0000

On Wed, Nov 10, 2010 at 11:42 AM, Andriy Gapon wrote:
> on 10/11/2010 21:08 Andriy Gapon said the following:
>> on 10/11/2010 20:26 Sergey Kandaurov said the following:
>>> Hi.
>>> If I understood you correctly, then you need
>>> PORTS_MODULES set in /etc/make.conf.
>>
>> It was a long time ago when I tried it last time, but I remember having problems
>> with it during upgrades.
>
> I think this is what it was/is.
> If a port in PORTS_MODULES has dependencies, then buildkernel would try to install
> those dependencies even if they are already installed.  And that, obviously, would
> fail.

Didn't know about this knob -- cool!

And FWIW, all it does is a:

all
install: deinstall reinstall (huh?)
reinstall: deinstall reinstall (huh?)
clean

Seems like it should be:

clean
all
[deinstall]
install
clean

or:

clean
all
install -DFORCE_PKG_REGISTER
clean

the first clean is just in case the PORTSWORKDIR is dirty.

Thanks!
-Garrett From owner-freebsd-fs@FreeBSD.ORG Wed Nov 10 20:21:39 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5100A1065673; Wed, 10 Nov 2010 20:21:39 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id 5D01D8FC17; Wed, 10 Nov 2010 20:21:37 +0000 (UTC) Received: by wya21 with SMTP id 21so1221731wya.13 for ; Wed, 10 Nov 2010 12:21:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=7Rnux6gzTQowlwVRDy8ov7o4lqrdqINruw1qCpoEXyE=; b=m3E+GFhLd+O+gnQc8gXZI93mKN3dDN+/rJf0VKyCKZs7QnE1QspZy5ivsFOkG6cHTg yz7rISTZKDjsaq5BHcL1UINmWjuc/Scm+AnoilXMZPBz4Cm5ESqn2o1ccqPtY3nLK0qG AtiV4iFsR6NoVvs5GX/n8BF6WbyJt7vTsItCY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=fs28eHzdxA85xV6vwXX6+o5vYhr9r3CsPzhz0ODn7b8nC5WMwUJv/8WPrtuARWfTEK l9eZuvxPKJx21qdyoTPyt+QZfd/E27VCT8WSYpLkeNg3s23gOWriVn7u2zXV6e7heUBv SboaJcO+WiVQp8EJpMaWgBAl9KV706t7G1XlI= MIME-Version: 1.0 Received: by 10.216.7.210 with SMTP id 60mr1544969wep.30.1289420496884; Wed, 10 Nov 2010 12:21:36 -0800 (PST) Sender: yanegomi@gmail.com Received: by 10.216.198.27 with HTTP; Wed, 10 Nov 2010 12:21:36 -0800 (PST) In-Reply-To: References: <4CD7C8FC.900@icyb.net.ua> <4CD7E515.5040209@icyb.net.ua> <4CD7E960.1070200@freebsd.org> <4CDA3CDD.5000404@freebsd.org> <4CDAEDB2.4010704@freebsd.org> <4CDAF5B1.4040501@freebsd.org> Date: Wed, 10 Nov 2010 12:21:36 -0800 X-Google-Sender-Auth: V2owgFY86dIsGIPTYZZW1uYjaJ8 Message-ID: 
From: Garrett Cooper To: Andriy Gapon Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, Ivan Voras , freebsd-current@freebsd.org Subject: Re: another fuse panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Nov 2010 20:21:39 -0000

On Wed, Nov 10, 2010 at 12:18 PM, Garrett Cooper wrote:
> On Wed, Nov 10, 2010 at 11:42 AM, Andriy Gapon wrote:
>> on 10/11/2010 21:08 Andriy Gapon said the following:
>>> on 10/11/2010 20:26 Sergey Kandaurov said the following:
>>>> Hi.
>>>> If I understood you correctly, then you need
>>>> PORTS_MODULES set in /etc/make.conf.
>>>
>>> It was a long time ago when I tried it last time, but I remember having problems
>>> with it during upgrades.
>>
>> I think this is what it was/is.
>> If a port in PORTS_MODULES has dependencies, then buildkernel would try to install
>> those dependencies even if they are already installed.  And that, obviously, would
>> fail.
>
> Didn't know about this knob -- cool!
>
> And FWIW, all it does is a:
>
> all
> install: deinstall reinstall (huh?)
> reinstall: deinstall reinstall (huh?)
> clean
>
> Seems like it should be:
>
> clean
> all
> [deinstall]
> install
> clean
>
> or:
>
> clean
> all
> install -DFORCE_PKG_REGISTER
> clean
>
> the first clean is just in case the PORTSWORKDIR is dirty.

And FWIW an even better idea might be to align the port with the process in use, i.e.

clean (i.e.
NO_CLEAN, KERNFAST, etc not specified) -> [${PORTSDIR}/${PORT}] clean buildkernel -> [${PORTSDIR}/${PORT}] all installkernel -> [${PORTSDIR}/${PORT}] deinstall install *shrugs* -Garrett From owner-freebsd-fs@FreeBSD.ORG Wed Nov 10 23:00:45 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A1BEB106564A for ; Wed, 10 Nov 2010 23:00:45 +0000 (UTC) (envelope-from mark@exonetric.com) Received: from relay0.exonetric.net (relay0.exonetric.net [82.138.248.161]) by mx1.freebsd.org (Postfix) with ESMTP id 6F23D8FC0C for ; Wed, 10 Nov 2010 23:00:45 +0000 (UTC) Received: from [192.168.0.7] (unknown [78.86.207.85]) by relay0.exonetric.net (Postfix) with ESMTP id 2876E57004 for ; Wed, 10 Nov 2010 22:28:28 +0000 (GMT) From: Mark Blackman Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Date: Wed, 10 Nov 2010 22:28:27 +0000 Message-Id: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com> To: freebsd-fs@freebsd.org Mime-Version: 1.0 (Apple Message framework v1081) X-Mailer: Apple Mail (2.1081) Subject: ZFS and pathconf(_PC_NO_TRUNC) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Nov 2010 23:00:45 -0000 Hi, I note that when testing the pathconf(2) NO_TRUNC property on a ZFS filesystem, I get a ENOENT, "No such file or directory". I'm not sure if this qualifies as correct behaviour, but thought a learned soul on this list could enlighten me. I've attached the C snippet I used for testing. 
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
	long result;

	if (argc < 2) {
		fprintf(stderr, "usage: %s path\n", argv[0]);
		return 1;
	}
	/* pathconf() returns -1 both on error and when the option is
	   unsupported or has no limit; only errno tells the cases apart. */
	errno = 0;
	result = pathconf(argv[1], _PC_NO_TRUNC);
	printf("for %s: no_trunc is %ld\n", argv[1], result);
	if (result < 0 && errno != 0)
		perror(NULL);
	return 0;
}

- Mark

From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 02:51:46 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 22ECE1065673; Thu, 11 Nov 2010 02:51:46 +0000 (UTC) (envelope-from kevlo@FreeBSD.org) Received: from ns.kevlo.org (kevlo.org [220.128.136.52]) by mx1.freebsd.org (Postfix) with ESMTP id A07AC8FC0A; Thu, 11 Nov 2010 02:51:45 +0000 (UTC) Received: from [127.0.0.1] (kevlo@kevlo.org [220.128.136.52]) by ns.kevlo.org (8.14.3/8.14.3) with ESMTP id oAB2OAAh020778; Thu, 11 Nov 2010 10:24:12 +0800 (CST) From: Kevin Lo To: delphij@FreeBSD.org Content-Type: text/plain; charset="UTF-8" Date: Thu, 11 Nov 2010 10:24:56 +0800 Message-ID: <1289442296.2128.16.camel@monet> Mime-Version: 1.0 X-Mailer: Evolution 2.30.3 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org Subject: Re: patch: let msdosfs(vfat)/ntfs to support UTF-8 locale well X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 02:51:46 -0000

Xin Li wrote:
> (cc'ed to freebsd-fs@)
>
> I think it's important that someone familiar with the code review and
> evaluate the current patches and commit it against -HEAD...
> MSDOSFS patch (against 7.1): > http://btload.googlegroups.com/web/msdosfs.patch?gda=MzIscT8AAABs_gmy4a1S9lRiXjEy-V5OpwtI67JnIGlz0zr18tjObOtoi5oIt3BJMRGeqGBbbj-ccyFKn-rNKC-d1pM_IdV0 > NTFS patch: > http://btload.googlegroups.com/web/ntfs.patch?gda=OqsHoDwAAABs_gmy4a1S9lRiXjEy-V5O7RN7t-m4MjZ-5dQn_EvaqDVCWO9_HyYEQJyRQYPtRCL9Wm-ajmzVoAFUlE7c_fAt Hi Xin, The MSDOSFS patch looks good to me. I've been testing this patch against -HEAD for years and it seems to be working great. I hope this patch will be committed soon. Thanks! > Cheers, > - -- > Xin LI http://www.delphij.net/ > FreeBSD - The Power to Serve! Kevin From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 04:00:35 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C0137106564A for ; Thu, 11 Nov 2010 04:00:35 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from mail2.fluidhosting.com (mx23.fluidhosting.com [204.14.89.6]) by mx1.freebsd.org (Postfix) with ESMTP id 641018FC12 for ; Thu, 11 Nov 2010 04:00:35 +0000 (UTC) Received: (qmail 25363 invoked by uid 399); 11 Nov 2010 04:00:32 -0000 Received: from localhost (HELO doug-optiplex.ka9q.net) (dougb@dougbarton.us@127.0.0.1) by localhost with ESMTPAM; 11 Nov 2010 04:00:32 -0000 X-Originating-IP: 127.0.0.1 X-Sender: dougb@dougbarton.us Message-ID: <4CDB6A5F.2000908@FreeBSD.org> Date: Wed, 10 Nov 2010 20:00:31 -0800 From: Doug Barton Organization: http://SupersetSolutions.com/ User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101028 Thunderbird/3.1.6 MIME-Version: 1.0 To: Aditya Sarawgi References: <20100929031825.L683@besplex.bde.org> <20100929084801.M948@besplex.bde.org> <20100929041650.GA1553@aditya> <201009290917.05269.jhb@freebsd.org> <20100929202526.GA1564@aditya> <4CD0A3E8.4080304@FreeBSD.org> <4CD201AE.3040409@FreeBSD.org> <20101108174327.GC2066@earth> <4CD9E535.8000801@FreeBSD.org> <20101110170719.GA1573@earth> 
In-Reply-To: <20101110170719.GA1573@earth> X-Enigmail-Version: 1.1.2 OpenPGP: id=1A1ABC84 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: fs@freebsd.org Subject: Re: ext2fs now extremely slow X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 04:00:35 -0000 On 11/10/2010 09:07, Aditya Sarawgi wrote: > On Tue, Nov 09, 2010 at 04:20:05PM -0800, Doug Barton wrote: >> Can you expand on that? What about it do you see as problematic? >> > > ext2fs is not a native filesystem. It is relatively slower than UFS Like I already said, I can live with a little bit slower. > may have some other problems like deadlocks which are not yet discovered. What better way to discover them than actual day to day use? :) > May make the data inconsistent due to lack of facilities like journaling. Well that's just plain unacceptable. Either the fs works reliably (which obviously includes safely) or it should be removed. At bare minimum if it can't reliably write data then support should be changed to read-only. > It is only meant to make your data in linux partitions accessible. Sorry, I'm not buying that. Don't get me wrong, I really appreciate the help you're providing, and I don't want to get you in the middle of my own personal vendetta, but we can't tolerate this perspective. We either need to support something, or not support it. >>>>> You can test Zheng's preallocation patch for ext2fs, there is a >>>>> serious lack of testers for that. >>>> >>>> I would be happy to do that, but my reading of this thread last month >>>> didn't produce a clear "try this version of the patch" neon sign. >>>> Various people referred to suggestions, updates, etc. 
If someone could >>>> provide a URL for the right patch to try, as well as a suggestion for >>>> benchmarking methodology, I'll be glad to do so. >>>> >>> >>> I have attached the patch. >> >> Thanks for that. I'm curious though whether this is the latest version >> of the patch with the suggested improvements from earlier in this thread? >> > > There will be only style and some comment fixes in the new patch. Ok, thanks. > Yes, we know there are some scope of improvements. So the BSDL ext2fs lacks > preallocation which used to preallocate some blocks which improved the sequential > write performance. This problem is solved by Zheng's reservation window work. > The other issue like Bruce mentioned is that some of the blocks in between are > skipped by the block allocator algorithm. We intend to have fixes for both of > these before 9 is released. Ok, but time is running out. :) Doug -- Nothin' ever doesn't change, but nothin' changes much. -- OK Go Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. 
:) http://SupersetSolutions.com/ From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 05:49:17 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7B0B51065670; Thu, 11 Nov 2010 05:49:17 +0000 (UTC) (envelope-from buganini@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 2A8198FC12; Thu, 11 Nov 2010 05:49:16 +0000 (UTC) Received: by iwn39 with SMTP id 39so1729243iwn.13 for ; Wed, 10 Nov 2010 21:49:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=YpXW1Py7o+5PTRqUc7qzg5uPksj9v0CxH/tCHbtFR28=; b=gwjya343sweUBebDqrGGldmabUtnN15Obr+/efI3ak2MBUUgvzCixDZJDi1AXHhQOU Iqo8LKIbNq23zpd7M4Dm2fQs75X5E1W4E7QxIazZMysk+VUwkm/24XtsAyhsosXN/4iO AbxaoKLKZcgEKBV6Yc9VJ5kxQJzv6AB9O3Ny8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=L/sF2tfrGe+z5uzoqd1/z2mG1OboKK8s67ukMvdD7o9LLpjWCK/N2uTRS3wkn920q3 NLGQ8L9h9qmb8734PXRv7vgY3Lw7cprg9zBh/hzmhL8WWVcbxnGB3i09yvc8vwmYgF0L il44czTOHIIhTpfATluc6zHgrPv6IuCLjEWH4= MIME-Version: 1.0 Received: by 10.231.13.136 with SMTP id c8mr220627iba.19.1289452852095; Wed, 10 Nov 2010 21:20:52 -0800 (PST) Received: by 10.231.32.194 with HTTP; Wed, 10 Nov 2010 21:20:52 -0800 (PST) In-Reply-To: <1289442296.2128.16.camel@monet> References: <1289442296.2128.16.camel@monet> Date: Thu, 11 Nov 2010 13:20:52 +0800 Message-ID: From: Buganini To: Kevin Lo Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org, delphij@freebsd.org Subject: Re: patch: let msdosfs(vfat)/ntfs to support UTF-8 locale well X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: 
Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 05:49:17 -0000 I'm using these two patches on CURRENT http://security-hole.info/~buganini/patches/kiconv_msdosfs/ From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 06:09:23 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 00A3A106566B for ; Thu, 11 Nov 2010 06:09:23 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from mail2.fluidhosting.com (mx23.fluidhosting.com [204.14.89.6]) by mx1.freebsd.org (Postfix) with ESMTP id 85CB48FC0A for ; Thu, 11 Nov 2010 06:09:20 +0000 (UTC) Received: (qmail 20813 invoked by uid 399); 11 Nov 2010 06:09:20 -0000 Received: from localhost (HELO doug-optiplex.ka9q.net) (dougb@dougbarton.us@127.0.0.1) by localhost with ESMTPAM; 11 Nov 2010 06:09:20 -0000 X-Originating-IP: 127.0.0.1 X-Sender: dougb@dougbarton.us Message-ID: <4CDB888E.6030005@FreeBSD.org> Date: Wed, 10 Nov 2010 22:09:18 -0800 From: Doug Barton Organization: http://SupersetSolutions.com/ User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101028 Thunderbird/3.1.6 MIME-Version: 1.0 To: freebsd-fs@freebsd.org X-Enigmail-Version: 1.1.2 OpenPGP: id=1A1ABC84 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Minor problem re-mounting ext2fs system X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 06:09:23 -0000 Even after a clean shutdown I get this error at nearly every reboot: kernel: last write time is in the future. kernel: (by less than a day, probably due to the hardware clock being incorrectly set). kernel: FIXED. It does get fixed, but it requires the fsck program from e2fsprogs to do it. 
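[Editorial aside] The wording of the message above ("in the future ... by less than a day") suggests a check roughly like the following C sketch: compare the superblock's last-write timestamp with the current clock and treat a skew of under one day as a fixable clock problem. This is only an illustration of that idea, not the e2fsprogs implementation; classify_wtime, the enum names, and the fudge parameter are all hypothetical.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical outcome of comparing a last-write time to "now". */
enum wtime_status {
	WTIME_OK,		/* last write is not in the future */
	WTIME_FUTURE_FUDGED,	/* in the future, but within the fudge window */
	WTIME_FUTURE		/* in the future by the fudge window or more */
};

/*
 * Classify a superblock last-write timestamp against the current time.
 * `fudge_secs` is the tolerated skew; one day (86400) is assumed here
 * because the quoted message says "by less than a day".
 */
static enum wtime_status
classify_wtime(time_t last_write, time_t now, long fudge_secs)
{
	assert(fudge_secs >= 0);
	if (last_write <= now)
		return WTIME_OK;
	if ((long long)last_write - (long long)now < fudge_secs)
		return WTIME_FUTURE_FUDGED;
	return WTIME_FUTURE;
}
```

If that reading is right, a hardware clock kept in local time on one OS and interpreted as UTC on the other would produce exactly this kind of sub-day skew at every reboot.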
Doug -- Nothin' ever doesn't change, but nothin' changes much. -- OK Go Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. :) http://SupersetSolutions.com/ From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 10:06:37 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 26BE2106567A for ; Thu, 11 Nov 2010 10:06:37 +0000 (UTC) (envelope-from rs@bytecamp.net) Received: from mail.bytecamp.net (mail.bytecamp.net [212.204.60.9]) by mx1.freebsd.org (Postfix) with ESMTP id 5DA938FC13 for ; Thu, 11 Nov 2010 10:06:35 +0000 (UTC) Received: (qmail 95076 invoked by uid 89); 11 Nov 2010 10:39:55 +0100 Received: from stella.bytecamp.net (HELO ?212.204.60.37?) (rs%bytecamp.net@212.204.60.37) by mail.bytecamp.net with CAMELLIA256-SHA encrypted SMTP; 11 Nov 2010 10:39:55 +0100 Message-ID: <4CDBB9EB.8010908@bytecamp.net> Date: Thu, 11 Nov 2010 10:39:55 +0100 From: Robert Schulze Organization: bytecamp GmbH User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.15) Gecko/20101027 Thunderbird/3.0.10 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit Subject: nfsd stuck in *rc_lock state X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 10:06:37 -0000 Hello everybody, we are running several 8.1 NFS Clients connected to a 8.0 NFS Server. The system ran fine, but today the NFS-Server locked up. nfsd ate 100% of CPU, with one {nfsd: service} thread in state "*rc_lock". 
I tried to find others with the same issue and finally ended up at a patch from Rick: http://people.freebsd.org/~rmacklem/freebsd8.1-patches/replay.patch may I apply this patch to a 8.0 system to fix this issue, or are there any other patches/commits which affect this? with kind regards, Robert Schulze From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 10:19:27 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 62EA5106564A for ; Thu, 11 Nov 2010 10:19:27 +0000 (UTC) (envelope-from gljennjohn@googlemail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id E27728FC0A for ; Thu, 11 Nov 2010 10:19:26 +0000 (UTC) Received: by fxm19 with SMTP id 19so1160724fxm.13 for ; Thu, 11 Nov 2010 02:19:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:cc:subject :message-id:in-reply-to:references:reply-to:x-mailer:mime-version :content-type:content-transfer-encoding; bh=UW+revAW1FFfXwu6BGHdliMXC85Rvb1K5URo5T3/CJk=; b=RqunH0l5ZZr2t3wK+tueHR/yc2Wj0ZwF+Owro4+xiL5zQpb6GWdbItsimYkUKX5sTM LfbTc24UJLDXIhpQsio4gaoj/kG8b4VLgSvn7xGnG1IzqM+mFHYtJFWh9mIJMSSq4l2U sWXILZrz/pHZIF4LhIQoI+1p9KBxZOAxmfhqM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=date:from:to:cc:subject:message-id:in-reply-to:references:reply-to :x-mailer:mime-version:content-type:content-transfer-encoding; b=T1r/tjoeMu3itCSNeUK4/egKrkBTqssVUQJfBVkyFs9uPpW66Luyu/niVDDbOlGC9x uLsqbt62hDx04M9shCeb59En2lzzArEx1P6HKaE/2aQSgxKEmin0LQIn3OxJEvxYNIox gdCck53Y+kHTHZUNQGysR2Nf4/1VsgQFxHMUU= Received: by 10.223.106.210 with SMTP id y18mr110008fao.108.1289470765068; Thu, 11 Nov 2010 02:19:25 -0800 (PST) Received: from ernst.jennejohn.org (p578E1734.dip.t-dialin.net [87.142.23.52]) by mx.google.com with ESMTPS id 
o7sm836079fal.3.2010.11.11.02.19.23 (version=TLSv1/SSLv3 cipher=RC4-MD5); Thu, 11 Nov 2010 02:19:24 -0800 (PST) Date: Thu, 11 Nov 2010 11:19:22 +0100 From: Gary Jennejohn To: delphij@FreeBSD.org Message-ID: <20101111111922.7fe8ab19@ernst.jennejohn.org> In-Reply-To: <1289442296.2128.16.camel@monet> References: <1289442296.2128.16.camel@monet> X-Mailer: Claws Mail 3.7.6 (GTK+ 2.18.7; amd64-portbld-freebsd9.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org Subject: Re: patch: let msdosfs(vfat)/ntfs to support UTF-8 locale well X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: gljennjohn@googlemail.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 10:19:27 -0000 Xin Li wrote: > > (cc'ed to freebsd-fs@) > > > > I think it's important that someone familiar with the code review and > > evaluate the current patches and commit it against -HEAD... > > > MSDOSFS patch (against 7.1): > > http://btload.googlegroups.com/web/msdosfs.patch?gda=MzIscT8AAABs_gmy4a1S9lRiXjEy-V5OpwtI67JnIGlz0zr18tjObOtoi5oIt3BJMRGeqGBbbj-ccyFKn-rNKC-d1pM_IdV0 > > NTFS patch: > > http://btload.googlegroups.com/web/ntfs.patch?gda=OqsHoDwAAABs_gmy4a1S9lRiXjEy-V5O7RN7t-m4MjZ-5dQn_EvaqDVCWO9_HyYEQJyRQYPtRCL9Wm-ajmzVoAFUlE7c_fAt > Just FYI. The NTFS patch is no longer found and the page is reported as no longer existing. 
-- Gary Jennejohn From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 12:06:36 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 03356106564A for ; Thu, 11 Nov 2010 12:06:36 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from lwfs1-cam.cam.lispworks.com (mail.lispworks.com [193.34.186.230]) by mx1.freebsd.org (Postfix) with ESMTP id 920AD8FC14 for ; Thu, 11 Nov 2010 12:06:34 +0000 (UTC) Received: from higson.cam.lispworks.com (IDENT:U2FsdGVkX1+e19cTDG6C3wTBNk2namDyp3/fFqQ2vxY@higson [192.168.1.7]) by lwfs1-cam.cam.lispworks.com (8.14.3/8.14.3) with ESMTP id oABC6VkA002088; Thu, 11 Nov 2010 12:06:31 GMT (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com by higson.cam.lispworks.com (8.13.1) id oABC6VGe027666; Thu, 11 Nov 2010 12:06:31 GMT Received: (from martin@localhost) by higson.cam.lispworks.com (8.13.1/8.13.1/Submit) id oABC6VYG027663; Thu, 11 Nov 2010 12:06:31 GMT Date: Thu, 11 Nov 2010 12:06:31 GMT Message-Id: <201011111206.oABC6VYG027663@higson.cam.lispworks.com> From: Martin Simmons To: mark@exonetric.com In-reply-to: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com> (message from Mark Blackman on Wed, 10 Nov 2010 22:28:27 +0000) References: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com> Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and pathconf(_PC_NO_TRUNC) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 12:06:36 -0000 >>>>> On Wed, 10 Nov 2010 22:28:27 +0000, Mark Blackman said: > > I note that when testing the pathconf(2) NO_TRUNC property > on a ZFS filesystem, I get a ENOENT, "No such file or directory". > > I'm not sure if this qualifies as correct behaviour, but thought > a learned soul on this list could enlighten me. 
> > I've attached the C snippet I used for testing. > > #include > #include > #include > #include > #include > > int main(int argc, char *argv[]){ > int result; > > result=pathconf(argv[1], _PC_NO_TRUNC); > printf("for %s: no_trunc is %d\n",argv[1],result); > if (result<0) > perror(NULL); > 1; > } Your call to printf is clobbering the real errno, which is EINVAL. That is an allowed value according to the pathconf man page: [EINVAL] The implementation does not support an association of the variable name with the associated file. So it is correct, but maybe not useful. __Martin From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 12:10:53 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D5131106564A for ; Thu, 11 Nov 2010 12:10:53 +0000 (UTC) (envelope-from mark@exonetric.com) Received: from relay0.exonetric.net (relay0.exonetric.net [82.138.248.161]) by mx1.freebsd.org (Postfix) with ESMTP id 9EC1A8FC15 for ; Thu, 11 Nov 2010 12:10:53 +0000 (UTC) Received: from [192.168.111.107] (unknown [62.244.179.66]) by relay0.exonetric.net (Postfix) with ESMTP id 990BF57228; Thu, 11 Nov 2010 12:10:52 +0000 (GMT) Mime-Version: 1.0 (Apple Message framework v1081) Content-Type: text/plain; charset=us-ascii From: Mark Blackman In-Reply-To: <201011111206.oABC6VYG027663@higson.cam.lispworks.com> Date: Thu, 11 Nov 2010 12:10:36 +0000 Content-Transfer-Encoding: quoted-printable Message-Id: <86371A88-1474-4A51-8C84-05C4A71A9135@exonetric.com> References: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com> <201011111206.oABC6VYG027663@higson.cam.lispworks.com> To: Martin Simmons X-Mailer: Apple Mail (2.1081) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and pathconf(_PC_NO_TRUNC) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 
11 Nov 2010 12:10:53 -0000 On 11 Nov 2010, at 12:06, Martin Simmons wrote: >>>>>> On Wed, 10 Nov 2010 22:28:27 +0000, Mark Blackman said: >> >> #include >> #include >> #include >> #include >> #include >> >> int main(int argc, char *argv[]){ >> int result; >> >> result=pathconf(argv[1], _PC_NO_TRUNC); >> printf("for %s: no_trunc is %d\n",argv[1],result); >> if (result<0) >> perror(NULL); >> 1; >> } > > Your call to printf is clobbering the real errno, which is EINVAL. Doh! thanks for pointing that out. :) > That is an > allowed value according to the pathconf man page: > > [EINVAL] The implementation does not support an association of > the variable name with the associated file. > > So it is correct, but maybe not useful. hmm. this is popping up in the context of building perl 5.12 on a zfs-only filesystem. One of the POSIX::* tests fails because of the above. - Mark From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 12:25:04 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7C46B106564A for ; Thu, 11 Nov 2010 12:25:04 +0000 (UTC) (envelope-from gleb.kurtsou@gmail.com) Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com [209.85.215.182]) by mx1.freebsd.org (Postfix) with ESMTP id CD3FD8FC1D for ; Thu, 11 Nov 2010 12:25:03 +0000 (UTC) Received: by eyb7 with SMTP id 7so1020245eyb.13 for ; Thu, 11 Nov 2010 04:25:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:cc:subject :message-id:references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=Y0K/XQGdHdjpHnZ8NZaSyIOdOcHMMP1v7v4eFcrgfHU=; b=V0EEWOSjCv3YGY0Vlz4q5tUIw3BIFW3j+C97eYt7qxBASOwhO6hGh7JnI0RPw5uYw/ xBKOmqabQMYQ03zz75XCtM2TGThcFXwzNJEs50tb6KAVzY7wVyc0SC6myEW7LBI0Twwh vRzrIL8zIIdlUhbYz3TSwP0putztjyiP7SEG8= DomainKey-Signature: a=rsa-sha1;
c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=bPA5ac9Iip3ioTDRKxPZiz2eizM/D0201lmXn7ep+RSIJxaP4oslTuaNyIul/2tWza 2//de9/yzQnjqoQMP9PHktr9SZTNsxlKBQWaaTMaYk5hq7DOCNL8rXPXYL+JUGPdzBO3 V5zJ5bNoCwLEQLYW5aYY0miFU2ETGnHdnf+BQ= Received: by 10.213.33.78 with SMTP id g14mr1654254ebd.3.1289478301577; Thu, 11 Nov 2010 04:25:01 -0800 (PST) Received: from localhost ([212.98.186.134]) by mx.google.com with ESMTPS id q58sm1867625eeh.3.2010.11.11.04.24.59 (version=TLSv1/SSLv3 cipher=RC4-MD5); Thu, 11 Nov 2010 04:25:00 -0800 (PST) Date: Thu, 11 Nov 2010 14:24:55 +0200 From: Gleb Kurtsou To: Buganini Message-ID: <20101111122455.GA2098@tops> References: <1289442296.2128.16.camel@monet> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org, Kevin Lo , delphij@freebsd.org Subject: Re: patch: let msdosfs(vfat)/ntfs to support UTF-8 locale well X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 12:25:04 -0000 On (11/11/2010 13:20), Buganini wrote: > I'm using these two patches on CURRENT > http://security-hole.info/~buganini/patches/kiconv_msdosfs/ Patch looks worth committing, I've once started working on very similar solution but never had time to finish it. What do you think about importing lower-upper case characters unicode ranges and extending kiconv to remove locale option (-L) from msdosfs? Thanks, Gleb.
From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 13:22:59 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0FCA7106566B for ; Thu, 11 Nov 2010 13:22:59 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail07.syd.optusnet.com.au (mail07.syd.optusnet.com.au [211.29.132.188]) by mx1.freebsd.org (Postfix) with ESMTP id A228A8FC0C for ; Thu, 11 Nov 2010 13:22:58 +0000 (UTC) Received: from c122-107-121-73.carlnfd1.nsw.optusnet.com.au (c122-107-121-73.carlnfd1.nsw.optusnet.com.au [122.107.121.73]) by mail07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id oABDMoZN016053 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 12 Nov 2010 00:22:51 +1100 Date: Fri, 12 Nov 2010 00:22:50 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Martin Simmons In-Reply-To: <201011111206.oABC6VYG027663@higson.cam.lispworks.com> Message-ID: <20101112000011.A1372@besplex.bde.org> References: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com> <201011111206.oABC6VYG027663@higson.cam.lispworks.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and pathconf(_PC_NO_TRUNC) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 13:22:59 -0000 On Thu, 11 Nov 2010, Martin Simmons wrote: >>>>>> On Wed, 10 Nov 2010 22:28:27 +0000, Mark Blackman said: >> >> I note that when testing the pathconf(2) NO_TRUNC property >> on a ZFS filesystem, I get a ENOENT, "No such file or directory". >> >> I'm not sure if this qualifies as correct behaviour, but thought >> a learned soul on this list could enlighten me. >> >> I've attached the C snippet I used for testing. 
>> >> #include >> #include >> #include >> #include >> #include >> >> int main(int argc, char *argv[]){ >> int result; >> >> result=pathconf(argv[1], _PC_NO_TRUNC); >> printf("for %s: no_trunc is %d\n",argv[1],result); >> if (result<0) >> perror(NULL); >> 1; >> } > > Your call to printf is clobbering the real errno, which is EINVAL. That is an > allowed value according to the pathconf man page: > > [EINVAL] The implementation does not support an association of > the variable name with the associated file. > > So it is correct, but maybe not useful. I think POSIX requires (in cross-references that are hard to read in ASCII versions) _PC_NO_TRUNC to be supported for directories (only). POSIX clearly says that whether _PC_NO_TRUNC is supported for non-directories is implementation-defined. For some feature test variables, it may be necessary to test both the pathconf variable and the compile-time variable (_POSIX_NO_TRUNC here). Under FreeBSD, _POSIX_NO_TRUNC is 1, which means that this feature applies to all files, but this is just wrong since individual file systems can and do return 0 for _PC_NO_TRUNC (msdosfs is one example -- it has to allow truncation since file names are often too long for 8.3 format, and truncating them is what msdos would do). So applications shouldn't trust _POSIX_NO_TRUNC. Fortunately, it is easiest to never use it (except in programs that test for bugs like the FreeBSD one). It is not useful, since even if it is correct then in most cases it will be 0 (meaning that whether the feature applies is fs-dependent, so that _PC_NO_TRUNC must be used). It is easiest to always just use _PC_NO_TRUNC and not use the ifdef tangle involving _POSIX_NO_TRUNC that is needed to sometimes avoid using _PC_NO_TRUNC. msdosfs may also be wrong in returning 0 for _PC_NO_TRUNC in both the 8.3 and longnames cases. I don't know what msdos does in the longnames case, but with long names there is no need to truncate.
Bruce From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 13:54:41 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 74444106564A for ; Thu, 11 Nov 2010 13:54:41 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail07.syd.optusnet.com.au (mail07.syd.optusnet.com.au [211.29.132.188]) by mx1.freebsd.org (Postfix) with ESMTP id 10A118FC0C for ; Thu, 11 Nov 2010 13:54:40 +0000 (UTC) Received: from c122-107-121-73.carlnfd1.nsw.optusnet.com.au (c122-107-121-73.carlnfd1.nsw.optusnet.com.au [122.107.121.73]) by mail07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id oABDsYZg013200 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 12 Nov 2010 00:54:35 +1100 Date: Fri, 12 Nov 2010 00:54:34 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Mark Blackman In-Reply-To: <86371A88-1474-4A51-8C84-05C4A71A9135@exonetric.com> Message-ID: <20101112002522.V1372@besplex.bde.org> References: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com> <201011111206.oABC6VYG027663@higson.cam.lispworks.com> <86371A88-1474-4A51-8C84-05C4A71A9135@exonetric.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and pathconf(_PC_NO_TRUNC) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 13:54:41 -0000 On Thu, 11 Nov 2010, Mark Blackman wrote: > On 11 Nov 2010, at 12:06, Martin Simmons wrote: >> Your call to printf is clobbering the real errno, which is EINVAL. > > Doh! thanks for pointing that out. :) > >> That is an >> allowed value according to the pathconf man page: >> >> [EINVAL] The implementation does not support an association of >> the variable name with the associated file. 
>> >> So it is correct, but maybe not useful. > > hmm. this is popping up in the context of building perl 5.12 on a zfs-only > filesystem. One of the POSIX::* tests fails because of the above. zfs_vnops.c:zfs_pathconf() is missing _PC_NO_TRUNC, so it seems to be just broken (it returns EOPNOTSUPP for cases not in the switch, so there seems to be no way for another level to support _PC_NO_TRUNC). It apparently depends on another layer providing defaults. Other basic things missing in it: _PC_NAME_MAX _PC_CHOWN_RESTRICTED _PC_PIPE_BUF [several other things that are in the switch statement in vop_stdpathconf(), but which are nonsense there since they only apply to device files and should depend on the file anyway, and which don't apply to zfs or any normal file system since device files on normal file systems are no longer supported] _PC_PIPE_BUF is not quite like the features that only apply to device files. It applies to named pipes, and since there is no defaulting of _PC_* in FreeBSD, all file systems that support named pipes must support it in their pathconf vop although it has nothing to do with file systems. Fortunately, pathconf() is never used except by naive programs like perl :-). _PC_NAME_MAX is used by patch(1) in FreeBSD, but patch(1) also has an ifdef tangle using _POSIX_NAME_MAX and other messes which I think allows patch to work accidentally if zfs returns EOPNOTSUPP: from backupfile.c: % void % addext(char *filename, char *ext, int e) % { % char *s = (char *)(uintptr_t)(const void *)basename (filename); % int slen = strlen (s), extlen = strlen (ext); % long slen_max = -1; % % #if HAVE_PATHCONF && defined (_PC_NAME_MAX) % #ifndef _POSIX_NAME_MAX % #define _POSIX_NAME_MAX 14 % #endif _POSIX_NAME_MAX is always 14 on POSIX systems, so this ifdef is no help. % if (slen + extlen <= _POSIX_NAME_MAX) % /* The file name is so short there's no need to call pathconf.
*/ % slen_max = _POSIX_NAME_MAX; % else if (s == filename) % slen_max = pathconf (".", _PC_NAME_MAX); I think we get here and pathconf() fails for names of length just 15 or greater. % else % { % char c = *s; % *s = 0; % slen_max = pathconf (filename, _PC_NAME_MAX); % *s = c; % } % #endif % if (slen_max == -1) { % #ifdef HAVE_LONG_FILE_NAMES % slen_max = 255; We get here on error (since although FreeBSD only has long file names on some file systems, patch is misconfigured, possibly by configuring it on a normal file system that has long names, so HAVE_LONG_FILE_NAMES is set unconditionally in the hard-configured config.h), so the max is essentially hard-coded as 255 if pathconf() fails. % #else % slen_max = 14; % #endif % } Bruce From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 14:17:58 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7241D1065693 for ; Thu, 11 Nov 2010 14:17:58 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail04.syd.optusnet.com.au (mail04.syd.optusnet.com.au [211.29.132.185]) by mx1.freebsd.org (Postfix) with ESMTP id 0DBCC8FC16 for ; Thu, 11 Nov 2010 14:17:57 +0000 (UTC) Received: from c122-107-121-73.carlnfd1.nsw.optusnet.com.au (c122-107-121-73.carlnfd1.nsw.optusnet.com.au [122.107.121.73]) by mail04.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id oABEHpoa020457 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 12 Nov 2010 01:17:52 +1100 Date: Fri, 12 Nov 2010 01:17:51 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Bruce Evans In-Reply-To: <20101112002522.V1372@besplex.bde.org> Message-ID: <20101112005719.O1598@besplex.bde.org> References: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com> <201011111206.oABC6VYG027663@higson.cam.lispworks.com> <86371A88-1474-4A51-8C84-05C4A71A9135@exonetric.com> <20101112002522.V1372@besplex.bde.org> MIME-Version: 1.0 
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and pathconf(_PC_NO_TRUNC) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 14:17:58 -0000 On Fri, 12 Nov 2010, Bruce Evans wrote: > On Thu, 11 Nov 2010, Mark Blackman wrote: > >> On 11 Nov 2010, at 12:06, Martin Simmons wrote: >>> Your call to printf is clobbering the real errno, which is EINVAL. >> >> Doh! thanks for pointing that out. :) >> >>> That is an >>> allowed value according to the pathconf man page: >>> >>> [EINVAL] The implementation does not support an association >>> of >>> the variable name with the associated file. >>> >>> So it is correct, but maybe not useful. >> >> hmm. this is popping up in the context of building perl 5.12 on a zfs-only >> filesystem. One of the POSIX::* tests fails because of the above. > > zfs_vnops.c:zfs_pathconf() is missing _PC_NO_TRUNC, so it seems to be just > broken (it returns EOPNOTSUPP for cases not in the switch, so there seems > to be no way for another level to support _PC_NO_TRUNC). It apparently > depends on another layer providing defaults. > > Other basic things missing in it: > _PC_NAME_MAX > _PC_CHOWN_RESTRICTED > _PC_PIPE_BUF > [several other things that are in the switch statement in vop_stdpathconf(), Oops, more careful grepping shows that zfs uses its own layer (zfs_freebsd_pathconf()) to provide defaults by calling vop_stdpathconf() if zfs_pathconf() fails with error EOPNOTSUPP, so it should never fail for _PC_NO_TRUNC or the other basic ones). But there seem to be problems somewhere. EOPNOTSUPP is not a possible error for pathconf(). EINVAL must be returned for unsupported feature tests. 
ffs handles all the cases directly, except it is more careful for fifos -- it calls fifo_pathconf() for these, where zfs seems to succeed for a lot of features that shouldn't apply to fifos, and then falls back to vop_stdpathconf() which succeeds for even more features where it shouldn't. fifo_pathconf() only succeeds for _PC_LINK_MAX, _PC_PIPE_BUF and _PC_CHOWN_RESTRICTED. Its setting of _PC_LINK_MAX is wrong, since the value of this is fs-dependent. _PC_PIPE_BUF is of course pipe+fifo-dependent so its setting belongs here and somewhere for pipes (I can't see where it is supported for fpathconf() on pipes). _PC_CHOWN_RESTRICTED also shouldn't be set here, but perhaps fifos can know a system-wide setting and repeat it, as other file systems do. Correct layering would probably result in vop_stdpathconf() not existing. It is currently more or less correct only for devfs, and is only used by zfs, coda, devfs, fdescfs and portalfs. Bruce From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 14:32:28 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DD3991065670 for ; Thu, 11 Nov 2010 14:32:28 +0000 (UTC) (envelope-from mark@exonetric.com) Received: from relay0.exonetric.net (relay0.exonetric.net [82.138.248.161]) by mx1.freebsd.org (Postfix) with ESMTP id A503A8FC17 for ; Thu, 11 Nov 2010 14:32:28 +0000 (UTC) Received: from [192.168.111.107] (unknown [62.244.179.66]) by relay0.exonetric.net (Postfix) with ESMTP id 79F3A57228; Thu, 11 Nov 2010 14:32:27 +0000 (GMT) Mime-Version: 1.0 (Apple Message framework v1081) Content-Type: text/plain; charset=us-ascii From: Mark Blackman In-Reply-To: <20101112005719.O1598@besplex.bde.org> Date: Thu, 11 Nov 2010 14:32:25 +0000 Content-Transfer-Encoding: quoted-printable Message-Id: <43792CD8-6127-4402-8A4F-5DFD894F7256@exonetric.com> References: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com> 
<201011111206.oABC6VYG027663@higson.cam.lispworks.com> <86371A88-1474-4A51-8C84-05C4A71A9135@exonetric.com> <20101112002522.V1372@besplex.bde.org> <20101112005719.O1598@besplex.bde.org> To: Bruce Evans X-Mailer: Apple Mail (2.1081) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and pathconf(_PC_NO_TRUNC) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 14:32:28 -0000 On 11 Nov 2010, at 14:17, Bruce Evans wrote: > where zfs seems to succeed for a lot > of features that shouldn't apply to fifos, and then falls back to > vop_stdpathconf() which succeeds for even more features where it shouldn't. > fifo_pathconf() only succeeds for _PC_LINK_MAX, _PC_PIPE_BUF and > _PC_CHOWN_RESTRICTED. Its setting of _PC_LINK_MAX is wrong, since the > value of this is fs-dependent. _PC_PIPE_BUF is of course pipe+fifo-dependent > so its setting belongs here and somewhere for pipes (I can't see where it > is supported for fpathconf() on pipes). _PC_CHOWN_RESTRICTED also shouldn't > be set here, but perhaps fifos can know a system-wide setting and repeat it, > as other file systems do. > > Correct layering would probably result in vop_stdpathconf() not existing. > It is currently more or less correct only for devfs, and is only used by > zfs, coda, devfs, fdescfs and portalfs. > > Bruce Ok, I'll file that bug then. I wonder how Solaris handles pathconf, but that's another question.
Cheers, Mark From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 20:23:11 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EACC0106566C; Thu, 11 Nov 2010 20:23:11 +0000 (UTC) (envelope-from lists@mawer.org) Received: from mail-vw0-f54.google.com (mail-vw0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id 95AE58FC19; Thu, 11 Nov 2010 20:23:11 +0000 (UTC) Received: by vws20 with SMTP id 20so623653vws.13 for ; Thu, 11 Nov 2010 12:23:10 -0800 (PST) MIME-Version: 1.0 Received: by 10.229.236.196 with SMTP id kl4mr1110283qcb.109.1289505467057; Thu, 11 Nov 2010 11:57:47 -0800 (PST) Received: by 10.229.91.66 with HTTP; Thu, 11 Nov 2010 11:57:46 -0800 (PST) In-Reply-To: References: <201011081004.59640.jhb@freebsd.org> <20101108151028.GI2392@deviant.kiev.zoral.com.ua> Date: Fri, 12 Nov 2010 06:57:46 +1100 Message-ID: From: Antony Mawer To: Ivan Voras Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org Subject: Re: The state of Giant lock in the file systems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 20:23:12 -0000 On Tue, Nov 9, 2010 at 2:28 AM, Ivan Voras wrote: > On 8 November 2010 16:10, Kostik Belousov wrote: > >> I already claimed several times that I will remove VFS_LOCK_GIANT >> after smbfs is locked. Patch for removal is sitting in my repository >> for almost a year. > > Ok, I've made a little table here: > > http://wiki.freebsd.org/MPSAFE_VFS FYI - NWFS is still functional in 8.x (there are some minor but annoying bugs, e.g. the root node path resolution occasionally trips over itself causing the mount point to become inaccessible, but that's been there since 4.x days), and I am happy to test any locking changes to it.
From memory NWFS and SMBFS share similar locking strategies so what gets done to one typically gets applied to the other. This got hit early in the 6.0 beta series where SMBFS had VFS locking changes which hadn't been applied to NWFS. On that occasion we were able to work with truckman@ to isolate the problem and get the right locking changes made in time for 6.0's release. On the SMBFS front, it's largely unmaintained as well (sadly) -- there are patches to add Unicode support to SMBFS which have been floating around since 2005, but so far they have (to my knowledge) never seen any reviews: http://people.freebsd.org/~imura/kiconv/ SMBFS and NWFS both share a lot of similar designs in terms of their FreeBSD implementations due to being both implemented by the same developer (bp@). --Antony From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 20:28:14 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 37FFC106566C; Thu, 11 Nov 2010 20:28:14 +0000 (UTC) (envelope-from lists@mawer.org) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id BB9038FC16; Thu, 11 Nov 2010 20:28:13 +0000 (UTC) Received: by qwj8 with SMTP id 8so1501535qwj.13 for ; Thu, 11 Nov 2010 12:28:13 -0800 (PST) MIME-Version: 1.0 Received: by 10.229.82.85 with SMTP id a21mr1121489qcl.71.1289505610127; Thu, 11 Nov 2010 12:00:10 -0800 (PST) Received: by 10.229.91.66 with HTTP; Thu, 11 Nov 2010 12:00:09 -0800 (PST) In-Reply-To: <20101111122455.GA2098@tops> References: <1289442296.2128.16.camel@monet> <20101111122455.GA2098@tops> Date: Fri, 12 Nov 2010 07:00:09 +1100 Message-ID: From: Antony Mawer To: Gleb Kurtsou Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org, Kevin Lo , delphij@freebsd.org Subject: Re: patch: let msdosfs(vfat)/ntfs to support UTF-8 locale well X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 20:28:14 -0000 On Thu, Nov 11, 2010 at 11:24 PM, Gleb Kurtsou wrote: > On (11/11/2010 13:20), Buganini wrote: >> I'm using these two patches on CURRENT >> http://security-hole.info/~buganini/patches/kiconv_msdosfs/ > > Patch looks worth committing, I've once started working on very similar > solution but never had time to finish it. > > What do think about importing lower-upper case characters unicode ranges > and extending kiconv to remove locale option (-L) from msdosfs? While we're on the topic of looking at filesystem's Unicode support, would anyone with the appropriate knowledge have a chance to look at these patches? http://people.freebsd.org/~imura/kiconv/ It would make smbfs against modern systems so much more usable... --Antony From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 20:30:16 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C7913106566B for ; Thu, 11 Nov 2010 20:30:16 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id B71A48FC0A for ; Thu, 11 Nov 2010 20:30:16 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oABKUFH5059845 for ; Thu, 11 Nov 2010 20:30:15 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oABKUFoF059840; Thu, 11 Nov 2010 20:30:15 GMT (envelope-from gnats) Date: Thu, 11 Nov 2010 20:30:15 GMT Message-Id: <201011112030.oABKUFoF059840@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Antony Mawer Cc: Subject: Re: kern/151845: [smbfs] [patch] smbfs should be 
upgraded to support Unicode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Antony Mawer List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 20:30:16 -0000 The following reply was made to PR kern/151845; it has been noted by GNATS. From: Antony Mawer To: bug-followup@FreeBSD.org, m.meelis@easybow.com Cc: Subject: Re: kern/151845: [smbfs] [patch] smbfs should be upgraded to support Unicode Date: Fri, 12 Nov 2010 06:55:35 +1100 There were some patches floating around to add Unicode support to smbfs as long as 5 years ago, apparently inspired by work done on Mac OS X. These are still available here: http://people.freebsd.org/~imura/kiconv/ The smbfs code hasn't changed too much in that time (it's pretty much unmaintained), so I don't think it would be too much work to dust them off and get them to apply against 8.x or -CURRENT. If someone out there with some knowledge of this area were able to spare a few hours to look at this, it would be a huge step in bringing SMBFS up to a modern usable level - at the moment it is largely useless as soon as you hit files with non-ASCII characters in their names. 
From owner-freebsd-fs@FreeBSD.ORG Thu Nov 11 22:54:44 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6DD6A106564A for ; Thu, 11 Nov 2010 22:54:44 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 2B4EF8FC0A for ; Thu, 11 Nov 2010 22:54:43 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApwEANcC3EyDaFvO/2dsb2JhbACDO6ABsRCQb4EigVKBY3MEhFqFfoUP X-IronPort-AV: E=Sophos;i="4.59,185,1288584000"; d="scan'208";a="100457969" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 11 Nov 2010 17:54:42 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 06E9FB3EB0; Thu, 11 Nov 2010 17:54:43 -0500 (EST) Date: Thu, 11 Nov 2010 17:54:43 -0500 (EST) From: Rick Macklem To: Robert Schulze Message-ID: <865951441.196904.1289516082967.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <4CDBB9EB.8010908@bytecamp.net> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [12.16.49.138] X-Mailer: Zimbra 6.0.7_GA_2476.RHEL4 (ZimbraWebClient - IE8 (Win)/6.0.7_GA_2473.RHEL4_64) Cc: freebsd-fs@freebsd.org Subject: Re: nfsd stuck in *rc_lock state X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Nov 2010 22:54:44 -0000 > Hello everybody, > > we are running several 8.1 NFS Clients connected to a 8.0 NFS Server. > The system ran fine, but today the NFS-Server locked up. > > nfsd ate 100% of CPU, with one {nfsd: service} thread in state > "*rc_lock". 
I tried to find others with the same issue and finally > ended > up at a patch from Rick: > > http://people.freebsd.org/~rmacklem/freebsd8.1-patches/replay.patch > > may I apply this patch to a 8.0 system to fix this issue, or are there > any other patches/commits which affect this? > That patch is "self contained", so I think it should be fine to apply it to an 8.0 server. You might also want http://people.freebsd.org/~rmacklem/freebsd8.0-patches/freebsd8-svc-mbufleak.patch which plugged an mbuf leak in the regular FreeBSD8.0 server. Good luck with it, rick From owner-freebsd-fs@FreeBSD.ORG Fri Nov 12 07:44:04 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 63E1C106564A; Fri, 12 Nov 2010 07:44:04 +0000 (UTC) (envelope-from kevlo@FreeBSD.org) Received: from ns.kevlo.org (kevlo.org [220.128.136.52]) by mx1.freebsd.org (Postfix) with ESMTP id 0273F8FC12; Fri, 12 Nov 2010 07:44:03 +0000 (UTC) Received: from [127.0.0.1] (kevlo@kevlo.org [220.128.136.52]) by ns.kevlo.org (8.14.3/8.14.3) with ESMTP id oAC7hFsS018602; Fri, 12 Nov 2010 15:43:16 +0800 (CST) From: Kevin Lo To: gljennjohn@googlemail.com In-Reply-To: <20101111111922.7fe8ab19@ernst.jennejohn.org> References: <1289442296.2128.16.camel@monet> <20101111111922.7fe8ab19@ernst.jennejohn.org> Content-Type: text/plain; charset="UTF-8" Date: Fri, 12 Nov 2010 15:44:02 +0800 Message-ID: <1289547842.6426.6.camel@monet> Mime-Version: 1.0 X-Mailer: Evolution 2.30.3 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, delphij@FreeBSD.org Subject: Re: patch: let msdosfs(vfat)/ntfs to support UTF-8 locale well X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Nov 2010 07:44:04 -0000 Gary Jennejohn wrote: > Xin Li wrote: > > > 
(cc'ed to freebsd-fs@) > > > > > > I think it's important that someone familiar with the code review and > > > evaluate the current patches and commit it against -HEAD... > > > > > MSDOSFS patch (against 7.1): > > > http://btload.googlegroups.com/web/msdosfs.patch?gda=MzIscT8AAABs_gmy4a1S9lRiXjEy-V5OpwtI67JnIGlz0zr18tjObOtoi5oIt3BJMRGeqGBbbj-ccyFKn-rNKC-d1pM_IdV0 > > > NTFS patch: > > > http://btload.googlegroups.com/web/ntfs.patch?gda=OqsHoDwAAABs_gmy4a1S9lRiXjEy-V5O7RN7t-m4MjZ-5dQn_EvaqDVCWO9_HyYEQJyRQYPtRCL9Wm-ajmzVoAFUlE7c_fAt > > > > Just FYI. > > The NTFS patch is no longer found and the page is reported as no longer > existing. Hmm... You can download the NTFS patch from http://btload.googlegroups.com/web/ntfs-utf8-patch.diff?gda=ImG5zEYAAABTKdAk9D4djfQOfSDW4ZV9M0mZmSlyAz1mAL3Bd_FXPiRk0z41TwnNjIphZItxmHHoNShIR6xqBMu6AvwilW_uE-Ea7GxYMt0t6nY0uV5FIQ Kevin From owner-freebsd-fs@FreeBSD.ORG Fri Nov 12 09:29:56 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B85F0106566C; Fri, 12 Nov 2010 09:29:56 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 8DE808FC15; Fri, 12 Nov 2010 09:29:56 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oAC9Tu2v000812; Fri, 12 Nov 2010 09:29:56 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oAC9TuFv000808; Fri, 12 Nov 2010 09:29:56 GMT (envelope-from linimon) Date: Fri, 12 Nov 2010 09:29:56 GMT Message-Id: <201011120929.oAC9TuFv000808@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/152022: [nfs] nfs service hangs with linux client 
[regression] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Nov 2010 09:29:56 -0000 Old Synopsis: nfs service hangs with linux client New Synopsis: [nfs] nfs service hangs with linux client [regression] Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Fri Nov 12 09:29:15 UTC 2010 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=152022 From owner-freebsd-fs@FreeBSD.ORG Fri Nov 12 09:32:55 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 66D2E1065675; Fri, 12 Nov 2010 09:32:55 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 3BE578FC13; Fri, 12 Nov 2010 09:32:55 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oAC9WtHG002521; Fri, 12 Nov 2010 09:32:55 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oAC9Wtjl002517; Fri, 12 Nov 2010 09:32:55 GMT (envelope-from linimon) Date: Fri, 12 Nov 2010 09:32:55 GMT Message-Id: <201011120932.oAC9Wtjl002517@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/152079: [msdosfs] [patch] Small cleanups from the other NetBSD/OpenBSD X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Nov 2010 09:32:55 -0000 Old Synopsis: msdosfs: 
Small cleanups from the other NetBSD/OpenBSD New Synopsis: [msdosfs] [patch] Small cleanups from the other NetBSD/OpenBSD Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Fri Nov 12 09:32:29 UTC 2010 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=152079 From owner-freebsd-fs@FreeBSD.ORG Fri Nov 12 11:58:01 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 09B201065670; Fri, 12 Nov 2010 11:58:01 +0000 (UTC) (envelope-from alexz@visp.ru) Received: from mail.visp.ru (srv1.visp.ru [91.215.204.2]) by mx1.freebsd.org (Postfix) with ESMTP id AF5828FC18; Fri, 12 Nov 2010 11:58:00 +0000 (UTC) Received: from 91-215-205-255.static.visp.ru ([91.215.205.255] helo=zagrebin) by mail.visp.ru with esmtp (Exim 4.72 (FreeBSD)) (envelope-from ) id 1PGsGU-0004rl-Dw; Fri, 12 Nov 2010 14:57:58 +0300 From: "Alexander Zagrebin" To: , Date: Fri, 12 Nov 2010 14:57:58 +0300 Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512 Thread-Index: AcuCYNju4THmbqyLQdC782dwcgeoxA== Cc: Subject: 8.1-STABLE: problem with unmounting ZFS snapshots X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Nov 2010 11:58:01 -0000 I have found that there is an issue with unmounting ZFS snapshots: the /sbin/umount "hangs" after unmounting. The test system is i386, but I can reproduce this issue on amd64 too. 
# uname -a FreeBSD alpha.vosz.local 8.1-STABLE FreeBSD 8.1-STABLE #0: Tue Oct 19 18:47:05 MSD 2010 root@alpha.vosz.local:/usr/obj/usr/src/sys/GENERIC i386 How to reproduce: # zfs snapshot pool/var@test # zfs list -t all -r pool/var NAME USED AVAIL REFER MOUNTPOINT pool/var 4,86M 2,99G 4,86M /var pool/var@test 0 - 4,86M - # mount -t zfs pool/var@test /mnt # mount ... pool/var@test on /mnt (zfs, local, noatime, read-only) # umount /mnt At this point umount hangs and it's impossible to kill it, even with `kill -9`. >From a working console I can see that: 1. the snapshot is unmounted successfully # mount pool/root on / (zfs, local) devfs on /dev (devfs, local, multilabel) pool/home on /home (zfs, local) pool/tmp on /tmp (zfs, local) pool/usr on /usr (zfs, local) pool/usr/src on /usr/src (zfs, local) pool/var on /var (zfs, local) 2. the umount is waiting for disk # ps | egrep 'PID|umount' PID TT STAT TIME COMMAND 958 0 D+ 0:00,04 umount /mnt # procstat -t 958 PID TID COMM TDNAME CPU PRI STATE WCHAN 958 100731 umount - 3 133 sleep mntref Can anybody confirm this issue? Any suggestions? 
-- Alexander Zagrebin From owner-freebsd-fs@FreeBSD.ORG Fri Nov 12 12:13:24 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 57B1C1065694; Fri, 12 Nov 2010 12:13:24 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 577A08FC31; Fri, 12 Nov 2010 12:13:23 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id OAA28407; Fri, 12 Nov 2010 14:13:20 +0200 (EET) (envelope-from avg@freebsd.org) Message-ID: <4CDD2F5F.2000902@freebsd.org> Date: Fri, 12 Nov 2010 14:13:19 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: Alexander Zagrebin References: In-Reply-To: X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Nov 2010 12:13:24 -0000 on 12/11/2010 13:57 Alexander Zagrebin said the following: > 2. the umount is waiting for disk > #ps | egrep 'PID|umount' > PID TT STAT TIME COMMAND > 958 0 D+ 0:00,04 umount /mnt > # procstat -t 958 > PID TID COMM TDNAME CPU PRI STATE WCHAN > 958 100731 umount - 3 133 sleep mntref procstat -kk > Can anybody confirm this issue? > Any suggestions? > ktrace-ing umount could also be useful. 
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Fri Nov 12 14:00:24 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3896F10656A8; Fri, 12 Nov 2010 14:00:24 +0000 (UTC) (envelope-from alexz@visp.ru) Received: from mail.visp.ru (srv1.visp.ru [91.215.204.2]) by mx1.freebsd.org (Postfix) with ESMTP id 2EF348FC15; Fri, 12 Nov 2010 14:00:14 +0000 (UTC) Received: from 91-215-205-255.static.visp.ru ([91.215.205.255] helo=zagrebin) by mail.visp.ru with esmtp (Exim 4.72 (FreeBSD)) (envelope-from ) id 1PGuAl-000Lo1-Nd; Fri, 12 Nov 2010 17:00:11 +0300 From: "Alexander Zagrebin" To: "'Andriy Gapon'" References: <4CDD2F5F.2000902@freebsd.org> Date: Fri, 12 Nov 2010 17:00:11 +0300 Keywords: freebsd-stable Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512 Thread-Index: AcuCYyqzfZYrwX5jRtKA2oGCQvAKogADjN1w In-Reply-To: <4CDD2F5F.2000902@freebsd.org> Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org Subject: RE: 8.1-STABLE: problem with unmounting ZFS snapshots X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Nov 2010 14:00:24 -0000 Thanks for your reply! > > 2. 
the umount is waiting for disk > > #ps | egrep 'PID|umount' > > PID TT STAT TIME COMMAND > > 958 0 D+ 0:00,04 umount /mnt > > # procstat -t 958 > > PID TID COMM TDNAME CPU PRI > STATE WCHAN > > 958 100731 umount - 3 133 > sleep mntref > > procstat -kk $ ps a | grep umount 86874 2- D 0:00,06 umount /mnt 90433 3 S+ 0:00,01 grep umount $ sudo procstat -kk 86874 PID TID COMM TDNAME KSTACK 86874 100731 umount - mi_switch+0x176 sleepq_wait+0x42 _sleep+0x317 vfs_mount_destroy+0x5a dounmount+0x4d4 unmount+0x38b syscall+0x1cf Xfast_syscall+0xe2 -- Alexander Zagrebin From owner-freebsd-fs@FreeBSD.ORG Fri Nov 12 14:27:05 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6DDA9106566B; Fri, 12 Nov 2010 14:27:05 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 562148FC18; Fri, 12 Nov 2010 14:27:03 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA00315; Fri, 12 Nov 2010 16:27:01 +0200 (EET) (envelope-from avg@freebsd.org) Message-ID: <4CDD4EB4.40004@freebsd.org> Date: Fri, 12 Nov 2010 16:27:00 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: Alexander Zagrebin References: <4CDD2F5F.2000902@freebsd.org> In-Reply-To: X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Nov 
2010 14:27:05 -0000 on 12/11/2010 16:00 Alexander Zagrebin said the following: > Thanks for your reply! > >>> 2. the umount is waiting for disk >>> #ps | egrep 'PID|umount' >>> PID TT STAT TIME COMMAND >>> 958 0 D+ 0:00,04 umount /mnt >>> # procstat -t 958 >>> PID TID COMM TDNAME CPU PRI >> STATE WCHAN >>> 958 100731 umount - 3 133 >> sleep mntref >> >> procstat -kk > > $ ps a | grep umount > 86874 2- D 0:00,06 umount /mnt > 90433 3 S+ 0:00,01 grep umount > > $ sudo procstat -kk 86874 > PID TID COMM TDNAME KSTACK > 86874 100731 umount - mi_switch+0x176 > sleepq_wait+0x42 _sleep+0x317 vfs_mount_destroy+0x5a dounmount+0x4d4 > unmount+0x38b syscall+0x1cf Xfast_syscall+0xe2 > Looks like a possible mnt_ref leak. I think something like that was fixed not long ago. Perhaps you either don't have the fix or there is another leak. What revision do you have? Perhaps Martin has some insight here. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Fri Nov 12 17:43:44 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0D9DB106566B; Fri, 12 Nov 2010 17:43:44 +0000 (UTC) (envelope-from sarawgi.aditya@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id BA7448FC17; Fri, 12 Nov 2010 17:43:43 +0000 (UTC) Received: by iwn39 with SMTP id 39so3747096iwn.13 for ; Fri, 12 Nov 2010 09:43:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=cp0173sigjDNVvCPuetu4pnYqA2uRnHVLIN2JtltlCI=; b=REPCLxRLYMoj/tRqLJ2JckzEMIXwyB3o5MPyKsiIuVLzBI2818Dn/jQyABIR1zGB2y nrZpWSG8NQWciGTTAU0DnQWDEmuHAECHpL0Z5o4K5AW59MpERsaNKHJ18Gjz1BIjPyrM Sp2OAGCXISALPM3W2VONkcuCSmovKjSiYvISA= DomainKey-Signature: 
a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=ddX2x7SLfOIlErFnrk3fj90t0GZvGH32RS4npSUH4HzvaJTeKhmmBfJz0TTUS+8Umq sLTdkmrqfvjYGpl+FuBztrXW2TWb/SB9LNK70XIIstHOT1ZrOuZt3cTDOkRKSCREJLMd 6t2uGSLR3DLFpUakcmWXOXvXCvA/IxGNlwtLE= MIME-Version: 1.0 Received: by 10.231.183.136 with SMTP id cg8mr2267704ibb.114.1289583822776; Fri, 12 Nov 2010 09:43:42 -0800 (PST) Received: by 10.231.253.81 with HTTP; Fri, 12 Nov 2010 09:43:42 -0800 (PST) In-Reply-To: <4CDB888E.6030005@FreeBSD.org> References: <4CDB888E.6030005@FreeBSD.org> Date: Fri, 12 Nov 2010 23:13:42 +0530 Message-ID: From: Aditya Sarawgi To: Doug Barton Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: Minor problem re-mounting ext2fs system X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Nov 2010 17:43:44 -0000 Hi, On Thu, Nov 11, 2010 at 11:39 AM, Doug Barton wrote: > Even after a clean shutdown I get this error at nearly every reboot: > > kernel: last write time is in the future. > kernel: (by less than a day, probably due to the hardware clock being > incorrectly set). > kernel: FIXED. > The ext2fs superblock (the structure that holds summary and other important information about the filesystem) includes a timestamp field recording the last write time. Clearly it makes no sense for a filesystem that is being mounted to carry a timestamp in the future, so this is treated as an inconsistency and the filesystem has to be fsck'd. You need to figure out why it is picking up a future timestamp. You can examine the superblock with dumpe2fs from e2fsprogs. > It does get fixed, but it requires the fsck program from e2fsprogs to do it. > > > Doug > > -- > >     Nothin' ever doesn't change, but nothin' changes much. 
>             -- OK Go > >     Breadth of IT experience, and depth of knowledge in the DNS. >     Yours for the right price. :) http://SupersetSolutions.com/ > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > -- Cheers, Aditya Sarawgi From owner-freebsd-fs@FreeBSD.ORG Fri Nov 12 22:35:20 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 10623106566C for ; Fri, 12 Nov 2010 22:35:20 +0000 (UTC) (envelope-from gull@gull.us) Received: from mail-ew0-f54.google.com (mail-ew0-f54.google.com [209.85.215.54]) by mx1.freebsd.org (Postfix) with ESMTP id 9F2488FC17 for ; Fri, 12 Nov 2010 22:35:19 +0000 (UTC) Received: by ewy3 with SMTP id 3so665709ewy.13 for ; Fri, 12 Nov 2010 14:35:18 -0800 (PST) MIME-Version: 1.0 Received: by 10.14.124.201 with SMTP id x49mr2089424eeh.7.1289599652793; Fri, 12 Nov 2010 14:07:32 -0800 (PST) Received: by 10.14.127.1 with HTTP; Fri, 12 Nov 2010 14:07:32 -0800 (PST) X-Originating-IP: [69.91.159.208] In-Reply-To: <4CDB6A5F.2000908@FreeBSD.org> References: <20100929031825.L683@besplex.bde.org> <20100929084801.M948@besplex.bde.org> <20100929041650.GA1553@aditya> <201009290917.05269.jhb@freebsd.org> <20100929202526.GA1564@aditya> <4CD0A3E8.4080304@FreeBSD.org> <4CD201AE.3040409@FreeBSD.org> <20101108174327.GC2066@earth> <4CD9E535.8000801@FreeBSD.org> <20101110170719.GA1573@earth> <4CDB6A5F.2000908@FreeBSD.org> Date: Fri, 12 Nov 2010 14:07:32 -0800 Message-ID: From: David Brodbeck To: fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Cc: Subject: Re: ext2fs now extremely slow X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Nov 2010 22:35:20 -0000 On Wed, Nov 10, 2010 at 8:00 PM, Doug Barton wrote: >> May make the data inconsistent due to lack of facilities like journaling. > > Well that's just plain unacceptable. Either the fs works reliably (which > obviously includes safely) or it should be removed. At bare minimum if it > can't reliably write data then support should be changed to read-only. ext2fs has never included journaling, so if you lose power while you're writing to it, it will be inconsistent and need to be fsck'd. This isn't unique to the FreeBSD implementation; it's just part of the design. Most Linux systems now use ext3fs, which is basically ext2 with journaling added. I kind of share Aditya's perspective that what you're trying to do is a bit odd, although it might be a good way to squash bugs. Still, what's next...trying to run make world on an msdos filesystem? ;) From owner-freebsd-fs@FreeBSD.ORG Sat Nov 13 02:27:12 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 03C751065672; Sat, 13 Nov 2010 02:27:12 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from mail.vx.sk (mail.vx.sk [IPv6:2a01:4f8:100:1043::3]) by mx1.freebsd.org (Postfix) with ESMTP id 459958FC16; Sat, 13 Nov 2010 02:27:11 +0000 (UTC) Received: from core.vx.sk (localhost [127.0.0.1]) by mail.vx.sk (Postfix) with ESMTP id 4405E11C005; Sat, 13 Nov 2010 03:27:08 +0100 (CET) X-Virus-Scanned: amavisd-new at mail.vx.sk Received: from mail.vx.sk ([127.0.0.1]) by core.vx.sk (mail.vx.sk [127.0.0.1]) (amavisd-new, port 10024) with LMTP id Qe5JL+6kd74J; Sat, 13 Nov 2010 03:27:06 +0100 (CET) Received: from [10.9.8.1] (188-167-78-139.dynamic.chello.sk [188.167.78.139]) by mail.vx.sk (Postfix) with ESMTPSA id B0C8511BFFB; Sat, 13 Nov 2010 03:27:05 +0100 (CET) Message-ID: <4CDDF77B.90708@FreeBSD.org> Date: 
Sat, 13 Nov 2010 03:27:07 +0100 From: Martin Matuska User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; sk; rv:1.8.1.23) Gecko/20090812 Lightning/0.9 Thunderbird/2.0.0.23 Mnenhy/0.7.5.0 MIME-Version: 1.0 To: Andriy Gapon References: <4CDD2F5F.2000902@freebsd.org> <4CDD4EB4.40004@freebsd.org> In-Reply-To: <4CDD4EB4.40004@freebsd.org> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Nov 2010 02:27:12 -0000 Yes, this is indeed a leak introduced by importing onnv revision 9214 and it exists in perforce as well - very easy to reproduce. # mount -t zfs test@t1 /mnt # umount /mnt (-> hang) http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6604992 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6810367 This is not compatible with mounting snapshots outside mounted ZFS and I was not able to reproduce the errors defined in 6604992 and 6810367 (they are Solaris-specific). I suggest we comment out this code (from head, later MFC and p4 as well). Patch (should work with HEAD and 8-STABLE): http://people.freebsd.org/~mm/patches/zfs/zfs_vfsops.c.patch Dňa 12.11.2010 15:27, Andriy Gapon wrote / napísal(a): > on 12/11/2010 16:00 Alexander Zagrebin said the following: >> Thanks for your reply! >> >>>> 2. 
the umount is waiting for disk >>>> #ps | egrep 'PID|umount' >>>> PID TT STAT TIME COMMAND >>>> 958 0 D+ 0:00,04 umount /mnt >>>> # procstat -t 958 >>>> PID TID COMM TDNAME CPU PRI >>> STATE WCHAN >>>> 958 100731 umount - 3 133 >>> sleep mntref >>> >>> procstat -kk >> >> $ ps a | grep umount >> 86874 2- D 0:00,06 umount /mnt >> 90433 3 S+ 0:00,01 grep umount >> >> $ sudo procstat -kk 86874 >> PID TID COMM TDNAME KSTACK >> 86874 100731 umount - mi_switch+0x176 >> sleepq_wait+0x42 _sleep+0x317 vfs_mount_destroy+0x5a dounmount+0x4d4 >> unmount+0x38b syscall+0x1cf Xfast_syscall+0xe2 >> > > > Looks like possible mnt_ref leak. > I think that something like that was fixed some not long time ago. > Perhaps you either don't have the fix or there is another leak. > What revision do you have? > > Perhaps Martin has an insight here. > From owner-freebsd-fs@FreeBSD.ORG Sat Nov 13 06:30:53 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 86829106566B; Sat, 13 Nov 2010 06:30:53 +0000 (UTC) (envelope-from alexz@visp.ru) Received: from mail.visp.ru (srv1.visp.ru [91.215.204.2]) by mx1.freebsd.org (Postfix) with ESMTP id EBA288FC1B; Sat, 13 Nov 2010 06:30:52 +0000 (UTC) Received: from 91-215-205-255.static.visp.ru ([91.215.205.255] helo=zagrebin) by mail.visp.ru with esmtp (Exim 4.72 (FreeBSD)) (envelope-from ) id 1PH9dS-000Ivu-4I; Sat, 13 Nov 2010 09:30:50 +0300 From: "Alexander Zagrebin" To: "'Martin Matuska'" References: <4CDD2F5F.2000902@freebsd.org> <4CDD4EB4.40004@freebsd.org> <4CDDF77B.90708@FreeBSD.org> Date: Sat, 13 Nov 2010 09:30:49 +0300 Keywords: freebsd-stable Message-ID: <8EEEFFFCCE94428992BE20F4A2EB8362@vosz.local> MIME-Version: 1.0 Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512 Thread-Index: 
AcuC2oQL0Tr2Iry0RMqsIX6jZG2/hgAGlavQ In-Reply-To: <4CDDF77B.90708@FreeBSD.org> Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org, 'Andriy Gapon' Subject: RE: 8.1-STABLE: problem with unmounting ZFS snapshots X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Nov 2010 06:30:53 -0000 > Yes, this is indeed a leak introduced by importing onnv revision 9214 > and it exists in perforce as well - very easy to reproduce. > > # mount -t zfs test@t1 /mnt > # umount /mnt (-> hang) > > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6604992 > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6810367 > > This is not compatible with mounting snapshots outside > mounted ZFS and I > was not able to reproduce the errors defined in 6604992 and 6810367 > (they are Solaris-specific). I suggest we comment out this code (from > head, later MFC and p4 as well). > > Patch (should work with HEAD and 8-STABLE): > http://people.freebsd.org/~mm/patches/zfs/zfs_vfsops.c.patch > The patch applied cleanly to the latest stable. umount no longer hangs. Thanks. Let me ask a question... I'm updating the source tree via csup/cvs. Is there a way to determine the SVN revision in this case? If not, would it be possible to add (and automatically maintain during svn -> cvs replication) a special file in the cvs tree (for example, /usr/src/revision) containing the current svn revision? 
-- Alexander Zagrebin From owner-freebsd-fs@FreeBSD.ORG Sat Nov 13 09:57:11 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E9347106566B; Sat, 13 Nov 2010 09:57:11 +0000 (UTC) (envelope-from arundel@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id BE71E8FC17; Sat, 13 Nov 2010 09:57:11 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oAD9vBeE031642; Sat, 13 Nov 2010 09:57:11 GMT (envelope-from arundel@freefall.freebsd.org) Received: (from arundel@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oAD9vBnF031638; Sat, 13 Nov 2010 09:57:11 GMT (envelope-from arundel) Date: Sat, 13 Nov 2010 09:57:11 GMT Message-Id: <201011130957.oAD9vBnF031638@freefall.freebsd.org> To: arundel@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: arundel@FreeBSD.org Cc: Subject: Re: bin/151713: [patch] Bug in growfs(8) with respect to 32-bit overflow X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Nov 2010 09:57:12 -0000 Synopsis: [patch] Bug in growfs(8) with respect to 32-bit overflow Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: arundel Responsible-Changed-When: Sat Nov 13 09:55:47 UTC 2010 Responsible-Changed-Why: Assign to freebsd-fs@, since they should have an opinion regarding this issue. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=151713 From owner-freebsd-fs@FreeBSD.ORG Sat Nov 13 10:29:10 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AECBF1065673; Sat, 13 Nov 2010 10:29:10 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 921488FC12; Sat, 13 Nov 2010 10:29:09 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id MAA11999; Sat, 13 Nov 2010 12:29:08 +0200 (EET) (envelope-from avg@freebsd.org) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1PHDM4-000Mtu-1O; Sat, 13 Nov 2010 12:29:08 +0200 Message-ID: <4CDE6823.6080907@freebsd.org> Date: Sat, 13 Nov 2010 12:27:47 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: Martin Matuska References: <4CDD2F5F.2000902@freebsd.org> <4CDD4EB4.40004@freebsd.org> <4CDDF77B.90708@FreeBSD.org> In-Reply-To: <4CDDF77B.90708@FreeBSD.org> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Nov 2010 10:29:10 -0000 on 13/11/2010 04:27 Martin Matuska said the following: > Yes, this is indeed a leak introduced by importing onnv revision 9214 > and it exists in perforce as well - very easy to reproduce. 
> > # mount -t zfs test@t1 /mnt > # umount /mnt (-> hang) > > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6604992 > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6810367 > > This is not compatible with mounting snapshots outside mounted ZFS and I > was not able to reproduce the errors defined in 6604992 and 6810367 > (they are Solaris-specific). I suggest we comment out this code (from > head, later MFC and p4 as well). > > Patch (should work with HEAD and 8-STABLE): > http://people.freebsd.org/~mm/patches/zfs/zfs_vfsops.c.patch Not quite sure, but perhaps it's better to make the logic in each place match the other. That is, I see that the code does hold on a filesystem of a covered vnode, but does rele on a parent ZFS filesystem. Or is this kind of protection not needed at all for FreeBSD? -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Sat Nov 13 11:06:33 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0BAFB106564A; Sat, 13 Nov 2010 11:06:33 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from mail.vx.sk (mail.vx.sk [IPv6:2a01:4f8:100:1043::3]) by mx1.freebsd.org (Postfix) with ESMTP id 906F18FC0A; Sat, 13 Nov 2010 11:06:32 +0000 (UTC) Received: from core.vx.sk (localhost [127.0.0.1]) by mail.vx.sk (Postfix) with ESMTP id B9FBE122D76; Sat, 13 Nov 2010 12:06:31 +0100 (CET) X-Virus-Scanned: amavisd-new at mail.vx.sk Received: from mail.vx.sk ([127.0.0.1]) by core.vx.sk (mail.vx.sk [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 8gpd45QMWnIz; Sat, 13 Nov 2010 12:06:27 +0100 (CET) Received: from [10.9.8.1] (188-167-78-139.dynamic.chello.sk [188.167.78.139]) by mail.vx.sk (Postfix) with ESMTPSA id 0D0E3122D5E; Sat, 13 Nov 2010 12:06:27 +0100 (CET) Message-ID: <4CDE7133.6010803@FreeBSD.org> Date: Sat, 13 Nov 2010 12:06:27 +0100 From: Martin Matuska User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; sk; 
rv:1.8.1.23) Gecko/20090812 Lightning/0.9 Thunderbird/2.0.0.23 Mnenhy/0.7.5.0 MIME-Version: 1.0 To: Andriy Gapon References: <4CDD2F5F.2000902@freebsd.org> <4CDD4EB4.40004@freebsd.org> <4CDDF77B.90708@FreeBSD.org> <4CDE6823.6080907@freebsd.org> In-Reply-To: <4CDE6823.6080907@freebsd.org> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Nov 2010 11:06:33 -0000 No, this is not good for us. Solaris does not allow "mounting" of snapshots on any vnode, like we do. Solaris has them only in .zfs/snapshots. Our approach allows us to have read-only snapshot mounts without even mounting the parent zfs. Before v15 we were happy with that code and had no issues :-) I have a very simple testcase where just fixing the VFS_RELE breaks our forced unmount. Let's say we use the correct VFS_RELE in zfs_vfsops.c: VFS_RELE(vfsp->mnt_vnodecovered->v_vfsp); Now let's say you have a mounted filesystem (e.g. md) under /mnt: /dev/md5 on /mnt (ufs, local) # mkdir /mnt/test # mount -t zfs tank@t2 /mnt/test # umount -f /mnt Now you will hang because of the second VFS_HOLD. So I stick to my opinion that this "extra protection" is more of a problem than a solution in our case and it should be commented out. On 13.11.2010 11:27, Andriy Gapon wrote: > on 13/11/2010 04:27 Martin Matuska said the following: >> Yes, this is indeed a leak introduced by importing onnv revision 9214 >> and it exists in perforce as well - very easy to reproduce. 
>> >> # mount -t zfs test@t1 /mnt >> # umount /mnt (-> hang) >> >> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6604992 >> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6810367 >> >> This is not compatible with mounting snapshots outside mounted ZFS and I >> was not able to reproduce the errors defined in 6604992 and 6810367 >> (they are Solaris-specific). I suggest we comment out this code (from >> head, later MFC and p4 as well). >> >> Patch (should work with HEAD and 8-STABLE): >> http://people.freebsd.org/~mm/patches/zfs/zfs_vfsops.c.patch > > Not quite sure, but perhaps it's better to make the logic in each place match > the other. That is, I see that the code does hold on a filesystem of a covered > vnode, but does rele on a parent ZFS filesystem. > Or is this kind of protection not needed at all for FreeBSD? > From owner-freebsd-fs@FreeBSD.ORG Sat Nov 13 11:11:19 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 856F21065701; Sat, 13 Nov 2010 11:11:19 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 6626C8FC08; Sat, 13 Nov 2010 11:11:17 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id NAA12371; Sat, 13 Nov 2010 13:11:16 +0200 (EET) (envelope-from avg@freebsd.org) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1PHE0q-000Mwy-MH; Sat, 13 Nov 2010 13:11:16 +0200 Message-ID: <4CDE7203.7090507@freebsd.org> Date: Sat, 13 Nov 2010 13:09:55 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: Martin Matuska References: 
<4CDD2F5F.2000902@freebsd.org> <4CDD4EB4.40004@freebsd.org> <4CDDF77B.90708@FreeBSD.org> <4CDE6823.6080907@freebsd.org> <4CDE7133.6010803@FreeBSD.org> In-Reply-To: <4CDE7133.6010803@FreeBSD.org> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Nov 2010 11:11:19 -0000 on 13/11/2010 13:06 Martin Matuska said the following: > No, this is not good for us. Solaris does not allow "mounting" of > snapshots on any vnode, like we do. Solaris has them only in > .zfs/snapshots. This allows us to have read-only mounts without even > mounting the parent zfs. > > Before v15 we have been happy with that code and had no issues :-) > > I have a very simple testcase where just fixing the VFS_RELE breaks our > forced unmount. Let's say we use the correct VFS_RELE in zfs_vfsops.c: > VFS_RELE(vfsp->mnt_vnodecovered->v_vfsp); > > Now let's say you have a mounted filesystem (e.g. md) under /mnt: > /dev/md5 on /mnt (ufs, local) > > # mkdir /mnt/test > # mount -t zfs tank@t2 /mnt/test > # umount -f /mnt > > Now you will hang because the second VFS_HOLD. Hang here would be bad, I agree. But I think that the umount shouldn't succeed either, in this case. > So I stick to my opinion > that this "extra protection" is more a problem than a solution in our > case and it should be commented out. 
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Sat Nov 13 11:21:18 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A350D1065679; Sat, 13 Nov 2010 11:21:18 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 3794E8FC28; Sat, 13 Nov 2010 11:21:17 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id oADBL47a079627 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 13 Nov 2010 13:21:04 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id oADBL4hj036568; Sat, 13 Nov 2010 13:21:04 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id oADBL4tO036567; Sat, 13 Nov 2010 13:21:04 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 13 Nov 2010 13:21:04 +0200 From: Kostik Belousov To: Andriy Gapon Message-ID: <20101113112104.GE2392@deviant.kiev.zoral.com.ua> References: <4CDD2F5F.2000902@freebsd.org> <4CDD4EB4.40004@freebsd.org> <4CDDF77B.90708@FreeBSD.org> <4CDE6823.6080907@freebsd.org> <4CDE7133.6010803@FreeBSD.org> <4CDE7203.7090507@freebsd.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="QrrxbCYKnJeJBlX9" Content-Disposition: inline In-Reply-To: <4CDE7203.7090507@freebsd.org> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-3.4 required=5.0 
tests=ALL_TRUSTED,AWL,BAYES_00, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Nov 2010 11:21:18 -0000 --QrrxbCYKnJeJBlX9 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sat, Nov 13, 2010 at 01:09:55PM +0200, Andriy Gapon wrote: > on 13/11/2010 13:06 Martin Matuska said the following: > > No, this is not good for us. Solaris does not allow "mounting" of > > snapshots on any vnode, like we do. Solaris has them only in > > .zfs/snapshots. This allows us to have read-only mounts without even > > mounting the parent zfs. > > > > Before v15 we have been happy with that code and had no issues :-) > > > > I have a very simple testcase where just fixing the VFS_RELE breaks our > > forced unmount. Let's say we use the correct VFS_RELE in zfs_vfsops.c: > > VFS_RELE(vfsp->mnt_vnodecovered->v_vfsp); > > > > Now let's say you have a mounted filesystem (e.g. md) under /mnt: > > /dev/md5 on /mnt (ufs, local) > > > > # mkdir /mnt/test > > # mount -t zfs tank@t2 /mnt/test > > # umount -f /mnt > > > > Now you will hang because the second VFS_HOLD. > > Hang here would be bad, I agree. > But I think that the umount shouldn't succeed either, in this case. Normal unmount indeed shall not succeed in this case, because mount adds a reference to the covered vnode. But forced unmount should be allowed to proceed. After unmount, you can use fsid to unmount the lower mount point. 
> > > So I stick to my opinion > > that this "extra protection" is more a problem than a solution in our > > case and it should be commented out. > > > -- > Andriy Gapon > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" --QrrxbCYKnJeJBlX9 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (FreeBSD) iEYEARECAAYFAkzedJ8ACgkQC3+MBN1Mb4iVbACg9BjzaWe4CKTTgoiDq/g3eJab gxIAoPIu6gsaPqGxSYGORw1XUPtuAgSx =P5Rh -----END PGP SIGNATURE----- --QrrxbCYKnJeJBlX9-- From owner-freebsd-fs@FreeBSD.ORG Sat Nov 13 11:26:48 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B7D6D1065695; Sat, 13 Nov 2010 11:26:48 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 8C9258FC20; Sat, 13 Nov 2010 11:26:47 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id NAA12514; Sat, 13 Nov 2010 13:26:45 +0200 (EET) (envelope-from avg@freebsd.org) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1PHEFo-000MyQ-NM; Sat, 13 Nov 2010 13:26:44 +0200 Message-ID: <4CDE75A4.8050702@freebsd.org> Date: Sat, 13 Nov 2010 13:25:24 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6 MIME-Version: 1.0 To: Kostik Belousov References: <4CDD2F5F.2000902@freebsd.org> <4CDD4EB4.40004@freebsd.org> <4CDDF77B.90708@FreeBSD.org> <4CDE6823.6080907@freebsd.org> <4CDE7133.6010803@FreeBSD.org> 
<4CDE7203.7090507@freebsd.org> <20101113112104.GE2392@deviant.kiev.zoral.com.ua> In-Reply-To: <20101113112104.GE2392@deviant.kiev.zoral.com.ua> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Nov 2010 11:26:48 -0000 on 13/11/2010 13:21 Kostik Belousov said the following: > On Sat, Nov 13, 2010 at 01:09:55PM +0200, Andriy Gapon wrote: >> on 13/11/2010 13:06 Martin Matuska said the following: >>> No, this is not good for us. Solaris does not allow "mounting" of >>> snapshots on any vnode, like we do. Solaris has them only in >>> .zfs/snapshots. This allows us to have read-only mounts without even >>> mounting the parent zfs. >>> >>> Before v15 we have been happy with that code and had no issues :-) >>> >>> I have a very simple testcase where just fixing the VFS_RELE breaks our >>> forced unmount. Let's say we use the correct VFS_RELE in zfs_vfsops.c: >>> VFS_RELE(vfsp->mnt_vnodecovered->v_vfsp); >>> >>> Now let's say you have a mounted filesystem (e.g. md) under /mnt: >>> /dev/md5 on /mnt (ufs, local) >>> >>> # mkdir /mnt/test >>> # mount -t zfs tank@t2 /mnt/test >>> # umount -f /mnt >>> >>> Now you will hang because the second VFS_HOLD. >> >> Hang here would be bad, I agree. >> But I think that the umount shouldn't succeed either, in this case. > Normal unmount indeed shall not succeed in this case, because mount > adds a reference to the covered vnode. But forced unmount should be > allowed to proceed. > > After unmount, you can use fsid to unmount the lower mount point. Ah, I see now, thank you for the explanation. 
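The hold/release mismatch discussed in this thread can be illustrated with a toy reference-count model. This is only a sketch under stated assumptions: the class `ToyMount` and the function `leftover_holds` are hypothetical names invented here, nothing below is FreeBSD or ZFS kernel code, and "cannot be unmounted while held" is a simplification of whatever the real unmount path does when it hangs.

```python
class ToyMount:
    """Hypothetical stand-in for a mounted filesystem with a hold counter."""

    def __init__(self, name, holds=0):
        self.name = name
        self.holds = holds

    def hold(self):
        # Loosely corresponds to VFS_HOLD in the thread.
        self.holds += 1

    def rele(self):
        # Loosely corresponds to VFS_RELE in the thread.
        self.holds -= 1

    def can_unmount(self):
        # Toy rule: a filesystem with outstanding holds cannot go away.
        return self.holds == 0


def leftover_holds(release_on_covered):
    """Mount and unmount a snapshot once; return holds left on the covered fs."""
    covered = ToyMount("fs of the covered vnode")
    # The parent starts with one unrelated hold (an assumption of this toy)
    # so that the mismatched release below stays non-negative.
    parent = ToyMount("parent ZFS", holds=1)

    covered.hold()  # snapshot mount takes a hold on the covered filesystem

    # Snapshot unmount releases either the matching or the wrong object.
    if release_on_covered:
        covered.rele()
    else:
        parent.rele()

    return covered.holds


# Mismatched release leaks one hold on the covered filesystem.
print(leftover_holds(release_on_covered=False))
# Matched release leaves nothing behind.
print(leftover_holds(release_on_covered=True))
```

In this toy, the matched variant still keeps the covered filesystem held for as long as the snapshot stays mounted, which corresponds to the forced-unmount scenario Martin describes above.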
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Sat Nov 13 23:35:47 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6F283106566B for ; Sat, 13 Nov 2010 23:35:47 +0000 (UTC) (envelope-from TERRY@tmk.com) Received: from server.tmk.com (server.tmk.com [204.141.35.63]) by mx1.freebsd.org (Postfix) with ESMTP id 4BC398FC0A for ; Sat, 13 Nov 2010 23:35:46 +0000 (UTC) Received: from tmk.com by tmk.com (PMDF V6.4 #37010) id <01NU7T8XRYKW00BCHX@tmk.com>; Sat, 13 Nov 2010 18:01:59 -0500 (EST) Date: Sat, 13 Nov 2010 18:00:29 -0500 (EST) From: Terry Kennedy To: freebsd-stable@freebsd.org, freebsd-fs@freebsd.org Message-id: <01NU7TBBN3D000BCHX@tmk.com> MIME-version: 1.0 Content-type: TEXT/PLAIN; CHARSET=us-ascii Cc: Subject: ZFS panic after replacing log device X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Nov 2010 23:35:47 -0000 I'm posting this to the freebsd-stable and freebsd-fs mailing lists. Followups should probably happen on freebsd-fs. I have a ZFS pool configured as: zpool create data raidz da1 da2 da3 da4 da5 raidz da6 da7 da8 da9 da10 raidz da11 da12 da13 da14 da15 spare da16 log da0 where da1-16 are WD2003FYYS drives (2TB RE4) and da0 is a 256GB PCI-Express SSD (name omitted to protect the guilty). The SSD has been dropping offline randomly - it seems that one or more flash modules pop out of their sockets and need to be re-seated frequently for some reason. The most recent time it did that, I replaced the SSD with another one (for some reason, the manufacturer ties the flash modules to a particular controller, so just moving the modules results in an offline SSD and inability to manage it due to "license limits exceeded" or some such nonsense). 
ZFS wasn't happy with the log device being changed, and reported it as corrupted, with the suggested corrective action being to "zpool clear" it. I did that, and then did a "zpool replace data da0 da0" and it claimed to successfully resilver it. I then did a "zpool scrub" and the scrub completed with no errors. So far, so good. However, any attempt to write to the array results in a near-immediate panic: panic: solaris assert: sm->sm_spare + size <= sm->sm_size, file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c, line: 93 cpuid=2 (Screenshot at http://www.tmk.com/transient/zfs-panic.png in case I mis-typed something). This is repeatable across reboot / scrub / test cycles. System is 8-STABLE as of Fri Nov 5 19:08:35 EDT 2010, on-disk pool is version 4/15, same as the kernel. I know that certain operations on log devices aren't supported until pool version 19 or thereabouts, but the error messages and zpool command results gave the impression that what I was doing was supported and worked (when it didn't). If this is truly a "you can't do that in pool version 15", perhaps a warning could be added so users don't get fooled into thinking it worked? I can give a developer remote console / root access to the box if that would help. I have a couple days before I will need to nuke the pool and restore it from backups. Terry Kennedy http://www.tmk.com terry@tmk.com New York, NY USA
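The assertion in the panic message above has the shape of an accounting invariant: the space already tracked plus the size being added must not exceed the total size of the map. A minimal sketch of such a check follows, assuming nothing about the real space_map.c beyond the expression quoted in the panic; `ToySpaceMap`, `size`, `space` and `add` are names chosen for this illustration only, and the sketch says nothing about why replacing the log device would trip the check.

```python
class ToySpaceMap:
    """Hypothetical bookkeeping structure, not the ZFS space map."""

    def __init__(self, size):
        self.size = size   # total capacity covered by the map
        self.space = 0     # amount currently accounted for

    def add(self, length):
        # Same shape as the quoted assertion: tracked + new <= total.
        if self.space + length > self.size:
            raise AssertionError(
                f"space ({self.space}) + size ({length}) exceeds map size ({self.size})"
            )
        self.space += length


toy = ToySpaceMap(size=100)
toy.add(60)
toy.add(40)          # exactly fills the map, still allowed
try:
    toy.add(1)       # one unit too many, the invariant fails
except AssertionError as err:
    print("assertion tripped:", err)
```

In a kernel the equivalent failure is a panic rather than an exception, which matches the behaviour reported on every write attempt.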