From owner-freebsd-fs@FreeBSD.ORG Sun Jul 26 11:49:54 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 43D111065672; Sun, 26 Jul 2009 11:49:54 +0000 (UTC) (envelope-from snb@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 17E668FC15; Sun, 26 Jul 2009 11:49:54 +0000 (UTC) (envelope-from snb@FreeBSD.org) Received: from freefall.freebsd.org (snb@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n6QBnr4T007486; Sun, 26 Jul 2009 11:49:53 GMT (envelope-from snb@freefall.freebsd.org) Received: (from snb@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n6QBnrBd007482; Sun, 26 Jul 2009 11:49:53 GMT (envelope-from snb) Date: Sun, 26 Jul 2009 11:49:53 GMT Message-Id: <200907261149.n6QBnrBd007482@freefall.freebsd.org> To: royce@tycho.org, snb@FreeBSD.org, freebsd-fs@FreeBSD.org From: snb@FreeBSD.org Cc: Subject: Re: kern/137034: [ufs] [lor] lock-order reversal near vfs_bio.c, ufs_dirhash.c X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Jul 2009 11:49:54 -0000 Synopsis: [ufs] [lor] lock-order reversal near vfs_bio.c, ufs_dirhash.c State-Changed-From-To: open->closed State-Changed-By: snb State-Changed-When: Sun 26 Jul 2009 11:39:52 UTC State-Changed-Why: This LOR cannot result in deadlock: http://sources.zabbadoz.net/freebsd/lor/261.html http://www.freebsd.org/cgi/query-pr.cgi?pr=137034 From owner-freebsd-fs@FreeBSD.ORG Mon Jul 27 04:43:46 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E39C710656C3; Mon, 27 Jul 2009 04:43:46 +0000 (UTC)
(envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id BAA078FC26; Mon, 27 Jul 2009 04:43:46 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (linimon@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n6R4hkIF086876; Mon, 27 Jul 2009 04:43:46 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n6R4hkDc086872; Mon, 27 Jul 2009 04:43:46 GMT (envelope-from linimon) Date: Mon, 27 Jul 2009 04:43:46 GMT Message-Id: <200907270443.n6R4hkDc086872@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-i386@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/137037: [zfs] [hang] zfs rollback on root causes FreeBSD to freeze in few seconds X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Jul 2009 04:43:47 -0000 Old Synopsis: zfs rolback on root causes FreeBSD to freez in few seconds New Synopsis: [zfs] [hang] zfs rollback on root causes FreeBSD to freeze in few seconds Responsible-Changed-From-To: freebsd-i386->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Jul 27 04:43:14 UTC 2009 Responsible-Changed-Why: reclassify. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=137037 From owner-freebsd-fs@FreeBSD.ORG Mon Jul 27 07:43:48 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 668FF1065673; Mon, 27 Jul 2009 07:43:48 +0000 (UTC) (envelope-from lists@jpru.de) Received: from jpru.ffm.jpru.de (jpru.ffm.jpru.de [195.49.136.33]) by mx1.freebsd.org (Postfix) with ESMTP id CE7608FC22; Mon, 27 Jul 2009 07:43:47 +0000 (UTC) (envelope-from lists@jpru.de) Received: from jpru.ffm.jpru.de (jpru.ffm.jpru.de [195.49.136.33]) by jpru.ffm.jpru.de (8.13.8/8.13.8) with ESMTP id n6R7P3dr052613; Mon, 27 Jul 2009 09:25:03 +0200 (CEST) (envelope-from lists@jpru.de) Received: (from unger@localhost) by jpru.ffm.jpru.de (8.13.8/8.13.8/Submit) id n6R7P3N2052612; Mon, 27 Jul 2009 09:25:03 +0200 (CEST) (envelope-from lists@jpru.de) X-Authentication-Warning: jpru.ffm.jpru.de: unger set sender to lists@jpru.de using -f Date: Mon, 27 Jul 2009 09:25:03 +0200 From: Juergen Unger To: freebsd-current@freebsd.org, freebsd-fs@freebsd.org Message-ID: <20090727072503.GA52309@jpru.ffm.jpru.de> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="OXfL5xGRrasGEqWY" Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Cc: Subject: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Jul 2009 07:43:48 -0000 --OXfL5xGRrasGEqWY Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi, I have one box where I am doing an zfs-receive job every five minutes for each of eleven zvols. Beside this no other service runs on this box. 
The system is a current checked out Jul 25 09:11 CET, compiled with the following options:

> include GENERIC
> options KVA_PAGES=512
> options KDB
> options DDB

uname -a:

> FreeBSD testbox 8.0-BETA2 FreeBSD 8.0-BETA2 #0: Sat Jul 25 21:43:42 CEST 2009 root@testbox:/usr/obj/usr/src/sys/ZFS-DEBUG i386

loader.conf:

> geom_mirror_load="YES"
> vm.kmem_size="1536M"
> vm.kmem_size_max="1536M"
> vfs.zfs.arc_max="100M"
> vfs.zfs.prefetch_disable=1

This runs quite well for a few hours, but after at most 20 to 30 hours I get this error:

> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0x4c
> fault code = supervisor read, page not present
> instruction pointer = 0x20:0x80883d93
> stack pointer = 0x28:0xfcd29b74
> frame pointer = 0x28:0xfcd29b94
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 36 (vnlru)
> [thread pid 36 tid 100062 ]
> Stopped at _sx_xlock+0x43: movl 0x10(%ebx),%eax
> db> bt
> Tracing pid 36 tid 100062 td 0x87166480
> _sx_xlock(3c,0,874aa28d,70f,8ae9a9f8,...) at _sx_xlock+0x43
> dmu_buf_update_user(0,8ae9a9f8,0,0,0,...) at dmu_buf_update_user+0x35
> zfs_znode_dmu_fini(8ae9a9f8,874b312d,1114,110b,879ab000,...) at zfs_znode_dmu_f3
> zfs_freebsd_reclaim(fcd29c3c,1,0,8ec63754,fcd29c60,...) at zfs_freebsd_reclaim+0
> VOP_RECLAIM_APV(874b65a0,fcd29c3c,0,0,8ec637c8,...) at VOP_RECLAIM_APV+0xa5
> vgonel(8ec637c8,0,80c77037,386,0,...) at vgonel+0x1a4
> vnlru_free(80f2a0f0,0,80c77037,300,3e8,...) at vnlru_free+0x2d5
> vnlru_proc(0,fcd29d38,80c652bc,33e,871932a8,...) at vnlru_proc+0x80
> fork_exit(8090d960,0,fcd29d38) at fork_exit+0xb8
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0, eip = 0, esp = 0xfcd29d70, ebp = 0 ---
> db>

Any suggestions?
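One detail of the trap output can be checked by hand. The faulting instruction is `movl 0x10(%ebx),%eax`, and the backtrace shows `_sx_xlock` called with a first argument of 0x3c. A minimal sketch of that arithmetic follows, assuming (this is not stated in the report) that `%ebx` held that first argument when the fault occurred; the helper name `effective_address` is hypothetical and exists only for this illustration. The sum matches the reported fault address, which is consistent with a lock being reached through a NULL base pointer, as the 0 passed to `dmu_buf_update_user` also suggests, but it does not prove that.

```python
# Illustration only: recompute the faulting address from values in the trap output.
# Assumption: %ebx held the first argument of _sx_xlock (0x3c) at the time of the fault.

def effective_address(base: int, displacement: int) -> int:
    """Return base + displacement, wrapped to 32 bits as on i386."""
    return (base + displacement) & 0xFFFFFFFF


sx_pointer = 0x3C      # first argument of _sx_xlock in the backtrace
displacement = 0x10    # from "movl 0x10(%ebx),%eax"
reported_fault = 0x4C  # "fault virtual address = 0x4c"

computed = effective_address(sx_pointer, displacement)
print(hex(computed), computed == reported_fault)  # prints "0x4c True"
```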
Juergen --OXfL5xGRrasGEqWY Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFKbVZOvt1rOHDRROsRAtmOAKDBBFERiQdhVC9sdqQGaVA9Vtq0CwCg95eR 72V/Qrx/IWVCVqkN/aM0W3w= =fmJg -----END PGP SIGNATURE----- --OXfL5xGRrasGEqWY-- From owner-freebsd-fs@FreeBSD.ORG Mon Jul 27 11:06:52 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EEC561065676 for ; Mon, 27 Jul 2009 11:06:52 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id DBB408FC35 for ; Mon, 27 Jul 2009 11:06:52 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n6RB6qGR018918 for ; Mon, 27 Jul 2009 11:06:52 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n6RB6qSn018914 for freebsd-fs@FreeBSD.org; Mon, 27 Jul 2009 11:06:52 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 27 Jul 2009 11:06:52 GMT Message-Id: <200907271106.n6RB6qSn018914@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Jul 2009 11:06:53 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. 
These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o kern/137037 fs [zfs] [hang] zfs rollback on root causes FreeBSD to fr o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136942 fs [zfs] zvol resize not reflected until reboot o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic o kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/136218 fs [zfs] Exported ZFS pools can't be imported into (Open) o kern/135594 fs [zfs] Single dataset unresponsive with Samba o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135480 fs [zfs] panic: lock &arg.lock already initialized o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o bin/135314 fs [zfs] assertion failed for zdb(8) usage o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot f kern/134496 fs [zfs] [panic] ZFS pool export occasionally causes a ke o kern/134491 fs [zfs] Hot spares are rather cold... o kern/133980 fs [panic] [ffs] panic: ffs_valloc: dup alloc o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis o kern/133614 fs [smbfs] [panic] panic: ffs_truncate: read-only filesys o kern/133373 fs [zfs] umass attachment causes ZFS checksum errors, dat o kern/133174 fs [msdosfs] [patch] msdosfs must support utf-encoded int f kern/133150 fs [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w o kern/133134 fs [zfs] Missing ZFS zpool labels f kern/133020 fs [zfs] [panic] inappropriate panic caused by zfs. 
Pani o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132597 fs [tmpfs] [panic] tmpfs-related panic while interrupting o kern/132551 fs [zfs] ZFS locks up on extattr_list_link syscall o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132337 fs [zfs] [panic] kernel panic in zfs_fuid_create_cred o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes f kern/132068 fs [zfs] page fault when using ZFS over NFS on 7.1-RELEAS o kern/131995 fs [nfs] Failure to mount NFSv4 server o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/131086 fs [ext2fs] [patch] mkfs.ext2 creates rotten partition o kern/130979 fs [smbfs] [panic] boot/kernel/smbfs.ko o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130229 fs [iconv] usermount fails on fs that need iconv o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/129148 fs [zfs] [panic] panic on concurrent writing & rollback o kern/129059 fs [zfs] [patch] ZFS bootloader whitelistable via WITHOUT f kern/128829 fs smbd(8) causes periodic panic on 7-RELEASE o kern/128633 fs [zfs] [lor] lock order reversal in zfs o kern/128514 fs [zfs] [mpt] problems with ZFS and LSILogic SAS/SATA Ad f kern/128173 fs [ext2fs] ls gives "Input/output error" on mounted ext3 o kern/127659 fs [tmpfs] tmpfs memory leak o kern/127492 fs [zfs] System hang on ZFS input-output o kern/127420 fs [gjournal] [panic] Journal 
overflow on gmirrored gjour o kern/127213 fs [tmpfs] sendfile on tmpfs data corruption o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/125644 fs [zfs] [panic] zfs unfixable fs errors caused panic whe f kern/125536 fs [ext2fs] ext 2 mounts cleanly but fails on commands li o kern/125149 fs [nfs] [panic] changing into .zfs dir from nfs client c f kern/124621 fs [ext3] [patch] Cannot mount ext2fs partition f bin/124424 fs [zfs] zfs(8): zfs list -r shows strange snapshots' siz o kern/123939 fs [msdosfs] corrupts new files o kern/122888 fs [zfs] zfs hang w/ prefetch on, zil off while running t o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o kern/122173 fs [zfs] [panic] Kernel Panic if attempting to replace a o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o kern/122047 fs [ext2fs] [patch] incorrect handling of UF_IMMUTABLE / o kern/122038 fs [tmpfs] [panic] tmpfs: panic: tmpfs_alloc_vp: type 0xc o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121779 fs [ufs] snapinfo(8) (and related tools?) 
only work for t o kern/121770 fs [zfs] ZFS on i386, large file or heavy I/O leads to ke o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha f kern/120991 fs [panic] [fs] [snapshot] System crashes when manipulati o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F o bin/120288 fs zfs(8): "zfs share -a" does not send SIGHUP to mountd f kern/119735 fs [zfs] geli + ZFS + samba starting on boot panics 7.0-B o kern/118912 fs [2tb] disk sizing/geometry problem with large array o misc/118855 fs [zfs] ZFS-related commands are nonfunctional in fixit o kern/118713 fs [minidump] [patch] Display media size required for a k o kern/118320 fs [zfs] [patch] NFS SETATTR sometimes fails to set file o bin/118249 fs mv(1): moving a directory changes its mtime o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o kern/116913 fs [ffs] [panic] ffs_blkfree: freeing free block p kern/116608 fs [msdosfs] [patch] msdosfs fails to check mount options o kern/116583 fs [ffs] [hang] System freezes for short time when using o kern/116170 fs [panic] Kernel panic when mounting /tmp o kern/115645 fs [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] 
smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o kern/113180 fs [zfs] Setting ZFS nfsshare property does not cause inh o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk o kern/105093 fs [ext2fs] [patch] ext2fs on read-only media cannot be m o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [iso9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna f kern/91568 fs [ufs] [panic] writing to UFS/softupdates DVD media in o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/89991 fs [ufs] softupdates with mount -ur causes fs UNREFS o kern/88657 fs [smbfs] windows 
client hang when browsing a samba shar o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o kern/85326 fs [smbfs] [panic] saving a file via samba to an overquot o kern/84589 fs [2TB] 5.4-STABLE unresponsive during background fsck 2 o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o kern/77826 fs [ext2fs] ext2fs usb filesystem will not mount RW o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 151 problems total. 
From owner-freebsd-fs@FreeBSD.ORG Mon Jul 27 13:40:03 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 637801065672 for ; Mon, 27 Jul 2009 13:40:03 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 37CB98FC14 for ; Mon, 27 Jul 2009 13:40:03 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n6RDe3SF042555 for ; Mon, 27 Jul 2009 13:40:03 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n6RDe3YR042554; Mon, 27 Jul 2009 13:40:03 GMT (envelope-from gnats) Date: Mon, 27 Jul 2009 13:40:03 GMT Message-Id: <200907271340.n6RDe3YR042554@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: John Baldwin Cc: Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: John Baldwin List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Jul 2009 13:40:03 -0000 The following reply was made to PR kern/136945; it has been noted by GNATS. From: John Baldwin To: bug-followup@freebsd.org, rene@freebsd.org Cc: Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) Date: Mon, 27 Jul 2009 08:32:26 -0400 I would actually expect this to be the correct order for these two locks. Can you capture the output of the 'debug.witness.fullgraph' sysctl to a file? 
-- John Baldwin From owner-freebsd-fs@FreeBSD.ORG Mon Jul 27 14:00:05 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C20A81065674 for ; Mon, 27 Jul 2009 14:00:05 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 958DE8FC2D for ; Mon, 27 Jul 2009 14:00:05 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n6RE05sT056473 for ; Mon, 27 Jul 2009 14:00:05 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n6RE05Rv056472; Mon, 27 Jul 2009 14:00:05 GMT (envelope-from gnats) Date: Mon, 27 Jul 2009 14:00:05 GMT Message-Id: <200907271400.n6RE05Rv056472@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Rene Ladan Cc: Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Rene Ladan List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Jul 2009 14:00:06 -0000 The following reply was made to PR kern/136945; it has been noted by GNATS. From: Rene Ladan To: John Baldwin Cc: bug-followup@freebsd.org Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) Date: Mon, 27 Jul 2009 15:51:15 +0200 2009/7/27 John Baldwin : > I would actually expect this to be the correct order for these two locks. Can > you capture the output of the 'debug.witness.fullgraph' sysctl to a file? > Yes, see attachment. I'm still running the same 8.0-BETA2.
From owner-freebsd-fs@FreeBSD.ORG Mon Jul 27 14:53:25 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BA338106566C; Mon, 27 Jul 2009 14:53:23 +0000 (UTC) (envelope-from pjd@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 8FCFD8FC17; Mon, 27 Jul 2009 14:53:23 +0000 (UTC) (envelope-from pjd@FreeBSD.org) Received: from freefall.freebsd.org (pjd@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n6RErNuJ001746; Mon, 27 Jul 2009 14:53:23 GMT (envelope-from pjd@freefall.freebsd.org) Received: (from pjd@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n6RErNs2001742; Mon, 27 Jul 2009 14:53:23 GMT (envelope-from pjd) Date: Mon, 27 Jul 2009 14:53:23 GMT Message-Id: <200907271453.n6RErNs2001742@freefall.freebsd.org> To: pjd@FreeBSD.org, freebsd-fs@FreeBSD.org, pjd@FreeBSD.org From: pjd@FreeBSD.org Cc: Subject: Re: kern/132337: [zfs] [panic] kernel panic in zfs_fuid_create_cred X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Jul 2009 14:53:26 -0000 Synopsis: [zfs] [panic] kernel panic in zfs_fuid_create_cred State-Changed-From-To: open->patched State-Changed-By: pjd State-Changed-When: Mon 27 Jul 14:52:49 2009 UTC State-Changed-Why: Fix committed to HEAD. Responsible-Changed-From-To: freebsd-fs->pjd Responsible-Changed-By: pjd Responsible-Changed-When: Mon 27 Jul 14:52:49 2009 UTC Responsible-Changed-Why: I'll take this one.
http://www.freebsd.org/cgi/query-pr.cgi?pr=132337 From owner-freebsd-fs@FreeBSD.ORG Mon Jul 27 14:55:00 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3FABD1065673; Mon, 27 Jul 2009 14:55:00 +0000 (UTC) (envelope-from pjd@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 139A28FC2B; Mon, 27 Jul 2009 14:55:00 +0000 (UTC) (envelope-from pjd@FreeBSD.org) Received: from freefall.freebsd.org (pjd@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n6REsxd3001830; Mon, 27 Jul 2009 14:54:59 GMT (envelope-from pjd@freefall.freebsd.org) Received: (from pjd@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n6REsxM1001826; Mon, 27 Jul 2009 14:54:59 GMT (envelope-from pjd) Date: Mon, 27 Jul 2009 14:54:59 GMT Message-Id: <200907271454.n6REsxM1001826@freefall.freebsd.org> To: gwenzel@univaud.com, pjd@FreeBSD.org, freebsd-fs@FreeBSD.org, pjd@FreeBSD.org From: pjd@FreeBSD.org Cc: Subject: Re: kern/133020: [zfs] [panic] inappropriate panic caused by zfs. Panic: zfs_fuid_create X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Jul 2009 14:55:00 -0000 Synopsis: [zfs] [panic] inappropriate panic caused by zfs. Panic: zfs_fuid_create State-Changed-From-To: feedback->closed State-Changed-By: pjd State-Changed-When: Mon 27 Jul 14:53:56 2009 UTC State-Changed-Why: Duplicate of 132337, which is now fixed in HEAD. Responsible-Changed-From-To: freebsd-fs->pjd Responsible-Changed-By: pjd Responsible-Changed-When: Mon 27 Jul 14:53:56 2009 UTC Responsible-Changed-Why: I'll take this one.
http://www.freebsd.org/cgi/query-pr.cgi?pr=133020 From owner-freebsd-fs@FreeBSD.ORG Mon Jul 27 20:15:30 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8B5D0106566B for ; Mon, 27 Jul 2009 20:15:30 +0000 (UTC) (envelope-from ohartman@mail.zedat.fu-berlin.de) Received: from outpost1.zedat.fu-berlin.de (outpost1.zedat.fu-berlin.de [130.133.4.66]) by mx1.freebsd.org (Postfix) with ESMTP id 15B0D8FC13 for ; Mon, 27 Jul 2009 20:15:29 +0000 (UTC) (envelope-from ohartman@mail.zedat.fu-berlin.de) Received: from inpost2.zedat.fu-berlin.de ([130.133.4.69]) by outpost1.zedat.fu-berlin.de (Exim 4.69) with esmtp (envelope-from ) id <1MVWL9-0006MM-9H>; Mon, 27 Jul 2009 21:58:31 +0200 Received: from e178050008.adsl.alicedsl.de ([85.178.50.8] helo=thor.walstatt.dyndns.org) by inpost2.zedat.fu-berlin.de (Exim 4.69) with esmtpsa (envelope-from ) id <1MVWL9-00062e-6j>; Mon, 27 Jul 2009 21:58:31 +0200 Message-ID: <4A6E06E6.9030300@mail.zedat.fu-berlin.de> Date: Mon, 27 Jul 2009 21:58:30 +0200 From: "O. Hartmann" User-Agent: Thunderbird 2.0.0.22 (X11/20090723) MIME-Version: 1.0 To: Juergen Unger References: <20090727072503.GA52309@jpru.ffm.jpru.de> In-Reply-To: <20090727072503.GA52309@jpru.ffm.jpru.de> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Originating-IP: 85.178.50.8 Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org, spambox@haruhiism.net Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Jul 2009 20:15:30 -0000 Juergen Unger wrote: > Hi, > > I have one box where I am doing an zfs-receive job every > five minutes for each of eleven zvols. 
Beside this no > other service runs on this box. > The system is an current checked out Jul 25 09:11 CET > compiled with the following options: > >> include GENERIC >> options KVA_PAGES=512 >> options KDB >> options DDB > > uname -a: > >> FreeBSD testbox 8.0-BETA2 FreeBSD 8.0-BETA2 #0: Sat Jul 25 21:43:42 CEST 2009 root@testbox:/usr/obj/usr/src/sys/ZFS-DEBUG i386 > > loader.conf: > >> geom_mirror_load="YES" >> vm.kmem_size="1536M" >> vm.kmem_size_max="1536M" >> vfs.zfs.arc_max="100M" >> vfs.zfs.prefetch_disable=1 > > This runs quite well for a few hours but after max 20 to 30 > hours I get this error: > >> Fatal trap 12: page fault while in kernel mode >> cpuid = 0; apic id = 00 >> fault virtual address = 0x4c >> fault code = supervisor read, page not present >> instruction pointer = 0x20:0x80883d93 >> stack pointer = 0x28:0xfcd29b74 >> frame pointer = 0x28:0xfcd29b94 >> code segment = base 0x0, limit 0xfffff, type 0x1b >> = DPL 0, pres 1, def32 1, gran 1 >> processor eflags = interrupt enabled, resume, IOPL = 0 >> current process = 36 (vnlru) >> [thread pid 36 tid 100062 ] >> Stopped at _sx_xlock+0x43: movl 0x10(%ebx),%eax >> db> bt >> Tracing pid 36 tid 100062 td 0x87166480 >> _sx_xlock(3c,0,874aa28d,70f,8ae9a9f8,...) at _sx_xlock+0x43 >> dmu_buf_update_user(0,8ae9a9f8,0,0,0,...) at dmu_buf_update_user+0x35 >> zfs_znode_dmu_fini(8ae9a9f8,874b312d,1114,110b,879ab000,...) at zfs_znode_dmu_f3 >> zfs_freebsd_reclaim(fcd29c3c,1,0,8ec63754,fcd29c60,...) at zfs_freebsd_reclaim+0 >> VOP_RECLAIM_APV(874b65a0,fcd29c3c,0,0,8ec637c8,...) at VOP_RECLAIM_APV+0xa5 >> vgonel(8ec637c8,0,80c77037,386,0,...) at vgonel+0x1a4 >> vnlru_free(80f2a0f0,0,80c77037,300,3e8,...) at vnlru_free+0x2d5 >> vnlru_proc(0,fcd29d38,80c652bc,33e,871932a8,...) at vnlru_proc+0x80 >> fork_exit(8090d960,0,fcd29d38) at fork_exit+0xb8 >> fork_trampoline() at fork_trampoline+0x8 >> --- trap 0, eip = 0, esp = 0xfcd29d70, ebp = 0 --- >> db> > > any suggestions ? 
> > Juergen > I see a similar problem on two SMP boxes (is your SMP?), but in my case, it seems not to be ZFS related although I also use ZFS as /home filesystem From owner-freebsd-fs@FreeBSD.ORG Mon Jul 27 21:33:57 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C9213106566C; Mon, 27 Jul 2009 21:33:57 +0000 (UTC) (envelope-from lists@jpru.de) Received: from jpru.ffm.jpru.de (jpru.ffm.jpru.de [195.49.136.33]) by mx1.freebsd.org (Postfix) with ESMTP id 5E6F18FC23; Mon, 27 Jul 2009 21:33:57 +0000 (UTC) (envelope-from lists@jpru.de) Received: from jpru.ffm.jpru.de (jpru.ffm.jpru.de [195.49.136.33]) by jpru.ffm.jpru.de (8.13.8/8.13.8) with ESMTP id n6RLXu6k066053; Mon, 27 Jul 2009 23:33:56 +0200 (CEST) (envelope-from lists@jpru.de) Received: (from unger@localhost) by jpru.ffm.jpru.de (8.13.8/8.13.8/Submit) id n6RLXu4L066052; Mon, 27 Jul 2009 23:33:56 +0200 (CEST) (envelope-from lists@jpru.de) X-Authentication-Warning: jpru.ffm.jpru.de: unger set sender to lists@jpru.de using -f Date: Mon, 27 Jul 2009 23:33:56 +0200 From: Juergen Unger To: "O. Hartmann" Message-ID: <20090727213355.GA37551@jpru.ffm.jpru.de> References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4A6E06E6.9030300@mail.zedat.fu-berlin.de> User-Agent: Mutt/1.4.2.3i Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Jul 2009 21:33:58 -0000 Hi, On Mon, Jul 27, 2009 at 09:58:30PM +0200, O. Hartmann wrote: [...] 
> >> db> bt > >> Tracing pid 36 tid 100062 td 0x87166480 > >> _sx_xlock(3c,0,874aa28d,70f,8ae9a9f8,...) at _sx_xlock+0x43 > >> dmu_buf_update_user(0,8ae9a9f8,0,0,0,...) at dmu_buf_update_user+0x35 > >> zfs_znode_dmu_fini(8ae9a9f8,874b312d,1114,110b,879ab000,...) at zfs_znode_dmu_f3 > >> zfs_freebsd_reclaim(fcd29c3c,1,0,8ec63754,fcd29c60,...) at zfs_freebsd_reclaim+0 > >> VOP_RECLAIM_APV(874b65a0,fcd29c3c,0,0,8ec637c8,...) at VOP_RECLAIM_APV+0xa5 > >> vgonel(8ec637c8,0,80c77037,386,0,...) at vgonel+0x1a4 > >> vnlru_free(80f2a0f0,0,80c77037,300,3e8,...) at vnlru_free+0x2d5 > >> vnlru_proc(0,fcd29d38,80c652bc,33e,871932a8,...) at vnlru_proc+0x80 > >> fork_exit(8090d960,0,fcd29d38) at fork_exit+0xb8 > >> fork_trampoline() at fork_trampoline+0x8 > >> --- trap 0, eip = 0, esp = 0xfcd29d70, ebp = 0 --- > >> db> > > I see a similar problem on two SMP boxes (is your SMP?), but in my case, > it seems not to be ZFS related although I also use ZFS as /home filesystem no real SMP, its only an old P4 with hyperthreading: > CPU: Intel(R) Pentium(R) 4 CPU 3.20GHz (3192.02-MHz 686-class CPU) > Origin = "GenuineIntel" Id = 0xf41 Stepping = 1 > Features=0xbfebfbff > Features2=0x441d > AMD Features=0x100000 > TSC: P-state invariant > real memory = 4294967296 (4096 MB) > avail memory = 3392716800 (3235 MB) > ACPI APIC Table: > FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs > FreeBSD/SMP: 1 package(s) x 1 core(s) x 2 HTT threads Juergen -- ENOSIG From owner-freebsd-fs@FreeBSD.ORG Tue Jul 28 06:43:09 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 82D75106564A for ; Tue, 28 Jul 2009 06:43:09 +0000 (UTC) (envelope-from bo.coopci@gmail.com) Received: from mail-px0-f196.google.com (mail-px0-f196.google.com [209.85.216.196]) by mx1.freebsd.org (Postfix) with ESMTP id 5847D8FC16 for ; Tue, 28 Jul 2009 06:43:09 +0000 (UTC) (envelope-from bo.coopci@gmail.com) 
Received: by pxi34 with SMTP id 34so1148469pxi.3 for ; Mon, 27 Jul 2009 23:43:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from :user-agent:mime-version:to:subject:content-type :content-transfer-encoding; bh=BuVNZbEbJL5XscPYqpU/giJa0AE6zI1qwxZnvVVUqDQ=; b=VK6LBfNemcnm6xEdCb3bj01TTIVLTC1SFsOWiluDEeqRbel3KTS1NYgCMKh+bVcxdj tZuc8F4WpsKAGQzD5s4QB+5kcZEvq6ooVQV3iw6Igi7b+E8Crx8OjbHmhF22Z+V7c0YE h8to4KtSAvwYieiYeGv78UWhiimoiNQhnQtXg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:subject :content-type:content-transfer-encoding; b=ZV4PMvHauOp8e2+fNzr+Nmja7V+0bk+5fDg5jw8b3I161p5UFkfG4FIkQvhdJKh9Cy OaAAWO+gvSusy3oBhlgq1uhxxqMR/Wn9QGN+BAzSk6y9LQG4ENY2ve097na0YjbdIKxO 0MUZUzb5x+3wU877vKPD7JfqoULdO54iXPdc8= Received: by 10.114.180.20 with SMTP id c20mr11256026waf.160.1248761616166; Mon, 27 Jul 2009 23:13:36 -0700 (PDT) Received: from ?10.217.15.161? ([61.135.152.194]) by mx.google.com with ESMTPS id m25sm15319555waf.9.2009.07.27.23.13.34 (version=SSLv3 cipher=RC4-MD5); Mon, 27 Jul 2009 23:13:35 -0700 (PDT) Message-ID: <4A6E970A.1060503@gmail.com> Date: Tue, 28 Jul 2009 14:13:30 +0800 From: cooper User-Agent: Thunderbird 2.0.0.22 (Windows/20090605) MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=GB2312 Content-Transfer-Encoding: 7bit Subject: About RAIDZ geometry X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Jul 2009 06:43:09 -0000

Hi folks,

Now that ZFS has been ported to FreeBSD, I think it's proper to ask these questions on this mailing list.

Is there any detailed documentation about RAIDZ geometry? I have been wondering how ZFS remembers which disk sectors are for data and which ones are for parity.
They claimed "Every block is its own RAID-Z stripe". Does that mean ZFS writes exactly one block in every vdev_raidz_io_start call if the I/O type of the zio is write?

They also claimed "You have to traverse the filesystem metadata to determine the RAID-Z geometry". Where is this metadata stored?

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 28 10:03:07 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 50D72106568D for ; Tue, 28 Jul 2009 10:03:07 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 818958FC12 for ; Tue, 28 Jul 2009 10:03:06 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id MAA28274; Tue, 28 Jul 2009 12:50:27 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4A6EC9E2.5070200@icyb.net.ua> Date: Tue, 28 Jul 2009 12:50:26 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090630) MIME-Version: 1.0 To: "O. Hartmann" References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> In-Reply-To: <4A6E06E6.9030300@mail.zedat.fu-berlin.de> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, Pawel Jakub Dawidek , freebsd-current@FreeBSD.org, spambox@haruhiism.net Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Jul 2009 10:03:07 -0000

on 27/07/2009 22:58 O. Hartmann said the following:
> Juergen Unger wrote:
[snip]
>>> _sx_xlock(3c,0,874aa28d,70f,8ae9a9f8,...)
at _sx_xlock+0x43
>>> dmu_buf_update_user(0,8ae9a9f8,0,0,0,...) at dmu_buf_update_user+0x35
>>> zfs_znode_dmu_fini(8ae9a9f8,874b312d,1114,110b,879ab000,...) at zfs_znode_dmu_f3
>>> zfs_freebsd_reclaim(fcd29c3c,1,0,8ec63754,fcd29c60,...) at zfs_freebsd_reclaim+0
>>> VOP_RECLAIM_APV(874b65a0,fcd29c3c,0,0,8ec637c8,...) at VOP_RECLAIM_APV+0xa5
>>> vgonel(8ec637c8,0,80c77037,386,0,...) at vgonel+0x1a4
>>> vnlru_free(80f2a0f0,0,80c77037,300,3e8,...) at vnlru_free+0x2d5
>>> vnlru_proc(0,fcd29d38,80c652bc,33e,871932a8,...) at vnlru_proc+0x80
>>> fork_exit(8090d960,0,fcd29d38) at fork_exit+0xb8
>>> fork_trampoline() at fork_trampoline+0x8
[snip]
>
> I see a similar problem on two SMP boxes (is your SMP?), but in my case,
> it seems not to be ZFS related although I also use ZFS as /home filesystem

In this case this does seem to be caused by ZFS. From the backtrace we see that
_sx_xlock() is called on bogus struct sx pointer (0x3c) and this is caused by
dmu_buf_update_user() called with NULL first argument (dmu_buf_t). Which means
that znode_t z_dbuf was NULL - this could have been caught by ASSERT in
zfs_znode_dmu_fini if it were enabled.

If you have the crash dump, then it would be interesting to examine znode_t
structure ('zp' argument) in zfs_znode_dmu_fini.

P.S. I see that zfs_inactive checks for z_dbuf being NULL and there is the
following comment:
/*
 * The fs has been unmounted, or we did a
 * suspend/resume and this file no longer exists.
 */
Maybe zfs_freebsd_reclaim should do the same?

P.P.S. I am not a VFS or ZFS expert.
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Tue Jul 28 13:50:45 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1D9E71065672; Tue, 28 Jul 2009 13:50:45 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id E5B648FC0A; Tue, 28 Jul 2009 13:50:44 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 9FC8346B0C; Tue, 28 Jul 2009 09:50:44 -0400 (EDT) Received: from jhbbsd.hudson-trading.com (unknown [209.249.190.8]) by bigwig.baldwin.cx (Postfix) with ESMTPA id 9A7288A0A4; Tue, 28 Jul 2009 09:50:43 -0400 (EDT) From: John Baldwin To: freebsd-fs@freebsd.org, Rene Ladan Date: Tue, 28 Jul 2009 09:41:32 -0400 User-Agent: KMail/1.9.7 References: <200907271400.n6RE05Rv056472@freefall.freebsd.org> In-Reply-To: <200907271400.n6RE05Rv056472@freefall.freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200907280941.32840.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Tue, 28 Jul 2009 09:50:43 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.5 required=4.2 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Jul 2009 13:50:45 -0000 On Monday 27 July 2009 
10:00:05 am Rene Ladan wrote: > The following reply was made to PR kern/136945; it has been noted by GNATS. > > From: Rene Ladan > To: John Baldwin > Cc: bug-followup@freebsd.org > Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) > Date: Mon, 27 Jul 2009 15:51:15 +0200 > > 2009/7/27 John Baldwin : > > I would actually expect this to be the correct order for these two locks.= > =A0Can > > you capture the output of the 'debug.witness.fullgraph' sysctl to a file? > > > Yes, see attachment. I'm still running the same 8.0-BETA2. Hmm, the attachment was eaten by a grue, can you post the file somewhere? -- John Baldwin From owner-freebsd-fs@FreeBSD.ORG Tue Jul 28 14:35:16 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5669310656C0 for ; Tue, 28 Jul 2009 14:35:16 +0000 (UTC) (envelope-from r.c.ladan@gmail.com) Received: from mail-ew0-f213.google.com (mail-ew0-f213.google.com [209.85.219.213]) by mx1.freebsd.org (Postfix) with ESMTP id D97E78FC33 for ; Tue, 28 Jul 2009 14:35:15 +0000 (UTC) (envelope-from r.c.ladan@gmail.com) Received: by ewy9 with SMTP id 9so50944ewy.43 for ; Tue, 28 Jul 2009 07:35:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=Nzyy/spqeS7YbMl3Nh30/Nk/HgSuzxD6jYdrOjt+EIo=; b=p18j+JS0lWDM7rG+UwTQslzAt415GaxoGSx4JiPqlnW9UowVgjhJbZU8ApBhFRlAJM 9nryXa+RRKYRffYbRGJfJbtFz2TK3WnznRp+QT/sol+SBUGaqkad8iVNbQCGFw2xse1u URThymvS9gYKQrEyaq3O86VRCIhBt7B/K/sEo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=o/ExXjdA+T/ZZop6tZoJXUubOe8qZgyR6yCskUiJy/1AwJw6+tMcrTcROShZ3PBEsx 
I0oiu+pCj5A4Qhdulf9yjM65v6aR8feX9gmeqPPQKBSHf47hGKPTi5lF4POXhxeaMNGS GK2fBqzabgwwZPm9ZXFJ2X1End40G/UnrSO0g= MIME-Version: 1.0 Received: by 10.216.16.212 with SMTP id h62mr1998461weh.201.1248789820281; Tue, 28 Jul 2009 07:03:40 -0700 (PDT) In-Reply-To: <200907280941.32840.jhb@freebsd.org> References: <200907271400.n6RE05Rv056472@freefall.freebsd.org> <200907280941.32840.jhb@freebsd.org> Date: Tue, 28 Jul 2009 16:03:40 +0200 Message-ID: From: Rene Ladan To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Jul 2009 14:35:17 -0000 2009/7/28 John Baldwin : > On Monday 27 July 2009 10:00:05 am Rene Ladan wrote: >> The following reply was made to PR kern/136945; it has been noted by GNA= TS. >> >> From: Rene Ladan >> To: John Baldwin >> Cc: bug-followup@freebsd.org >> Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) >> Date: Mon, 27 Jul 2009 15:51:15 +0200 >> >> =A02009/7/27 John Baldwin : >> =A0> I would actually expect this to be the correct order for these two > locks.=3D >> =A0 =3DA0Can >> =A0> you capture the output of the 'debug.witness.fullgraph' sysctl to a= file? >> =A0> >> =A0Yes, see attachment. =A0I'm still running the same 8.0-BETA2. > > Hmm, the attachment was eaten by a grue, can you post the file somewhere? 
> Yes, see ftp://rene-ladan.nl/pub/freebsd/kern_136945.txt Ren=E9 From owner-freebsd-fs@FreeBSD.ORG Tue Jul 28 14:38:54 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 530DF1065678 for ; Tue, 28 Jul 2009 14:38:54 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 2502A8FC13 for ; Tue, 28 Jul 2009 14:38:54 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id D1D8846B0C; Tue, 28 Jul 2009 10:38:53 -0400 (EDT) Received: from jhbbsd.hudson-trading.com (unknown [209.249.190.8]) by bigwig.baldwin.cx (Postfix) with ESMTPA id 307268A0A2; Tue, 28 Jul 2009 10:38:53 -0400 (EDT) From: John Baldwin To: Rene Ladan Date: Tue, 28 Jul 2009 10:38:29 -0400 User-Agent: KMail/1.9.7 References: <200907271400.n6RE05Rv056472@freefall.freebsd.org> <200907280941.32840.jhb@freebsd.org> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Message-Id: <200907281038.30277.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Tue, 28 Jul 2009 10:38:53 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.5 required=4.2 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: freebsd-fs@freebsd.org Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: 
Tue, 28 Jul 2009 14:38:54 -0000

On Tuesday 28 July 2009 10:03:40 am Rene Ladan wrote:
> 2009/7/28 John Baldwin :
> > On Monday 27 July 2009 10:00:05 am Rene Ladan wrote:
> >> The following reply was made to PR kern/136945; it has been noted by GNATS.
> >>
> >> From: Rene Ladan
> >> To: John Baldwin
> >> Cc: bug-followup@freebsd.org
> >> Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
> >> Date: Mon, 27 Jul 2009 15:51:15 +0200
> >>
> >>  2009/7/27 John Baldwin :
> >>  > I would actually expect this to be the correct order for these two locks. Can
> >>  > you capture the output of the 'debug.witness.fullgraph' sysctl to a file?
> >>  >
> >>  Yes, see attachment. I'm still running the same 8.0-BETA2.
> >
> > Hmm, the attachment was eaten by a grue, can you post the file somewhere?
> >
> Yes, see ftp://rene-ladan.nl/pub/freebsd/kern_136945.txt

Ok, it looks like it did encounter a UFS -> filedesc order at some point. Can
you patch sys/kern/subr_witness.c to add a section to the order_lists[] array
after the 'ZFS locking list' and before the spin locks list that looks like
this:

	{ "filedesc structure", &lock_class_sx },
	{ "ufs", &lock_class_lockmgr},
	{ NULL, NULL },

--
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 05:49:12 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 31189106568F for ; Wed, 29 Jul 2009 05:49:12 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from mail.jrv.org (adsl-70-243-84-13.dsl.austtx.swbell.net [70.243.84.13]) by mx1.freebsd.org (Postfix) with ESMTP id EF8E98FC1A for ; Wed, 29 Jul 2009 05:49:11 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from kremvax.housenet.jrv (kremvax.housenet.jrv [192.168.3.124]) by mail.jrv.org (8.14.3/8.14.3) with ESMTP id n6T5nASs001090 for ; Wed, 29 Jul 2009 00:49:10
-0500 (CDT) (envelope-from james-freebsd-fs2@jrv.org) Authentication-Results: mail.jrv.org; domainkeys=pass (testing) header.from=james-freebsd-fs2@jrv.org DomainKey-Signature: a=rsa-sha1; s=enigma; d=jrv.org; c=nofws; q=dns; h=message-id:date:from:user-agent:mime-version:to:subject: content-type:content-transfer-encoding; b=atJdkXmRT9kqmQZKa8kdBZQy/YZCDvMjJzBCnlgmKRdik05gr20EvdaR4RbvFq0mB y0GNcQdzn4RF/6/WyB5Lx08760gEqAvTxhpBqHHyJF/LTOqjMASgzt0u1WrYvo7D4yj 87UqiosLLsifaEl27sU5+3X7kSCMFekx1xC9kPc= Message-ID: <4A6FE2D6.7050300@jrv.org> Date: Wed, 29 Jul 2009 00:49:10 -0500 From: "James R. Van Artsdalen" User-Agent: Thunderbird 2.0.0.22 (Macintosh/20090605) MIME-Version: 1.0 To: freebsd-fs Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: [ZFS] umount at reboot crashes X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 05:49:12 -0000

FreeBSD bigback.housenet.jrv 8.0-BETA2 FreeBSD 8.0-BETA2 #0 r195757M: Mon Jul 20 10:27:28 CDT 2009 james@bigback.housenet.jrv:/usr/obj/usr/src/sys/BIGTEX amd64

I have a system that almost always crashes whenever it receives a ZFS replication package ("zfs recv") that either deletes or renames filesystems, both operations requiring unmounting. Sometimes it crashes later in the "zfs recv", sometimes not until I reboot that system.

The sx_xlock() in frame 10 seems to be a common theme in these crashes.

The dump is available.

#0 doadump () at pcpu.h:223
223 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump () at pcpu.h:223
#1 0xffffffff801dfdec in db_fncall (dummy1=Variable "dummy1" is not available.
) at /usr/src/sys/ddb/db_command.c:548
#2 0xffffffff801e0121 in db_command (last_cmdp=0xffffffff80bbd9e0, cmd_table=Variable "cmd_table" is not available.
) at /usr/src/sys/ddb/db_command.c:445 #3 0xffffffff801e0370 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498 #4 0xffffffff801e2349 in db_trap (type=Variable "type" is not available. ) at /usr/src/sys/ddb/db_main.c:229 #5 0xffffffff805bab85 in kdb_trap (type=12, code=0, tf=0xffffff810f20a690) at /usr/src/sys/kern/subr_kdb.c:534 #6 0xffffffff8083cf7d in trap_fatal (frame=0xffffff810f20a690, eva=Variable "eva" is not available. ) at /usr/src/sys/amd64/amd64/trap.c:847 #7 0xffffffff8083d2ed in trap_pfault (frame=0xffffff810f20a690, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:768 #8 0xffffffff8083dce3 in trap (frame=0xffffff810f20a690) at /usr/src/sys/amd64/amd64/trap.c:494 #9 0xffffffff80823883 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224 #10 0xffffffff80592e4a in _sx_xlock (sx=0x58, opts=0, file=0xffffffff810b4d68 "/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c", line=1807) at /usr/src/sys/kern/kern_sx.c:284 #11 0xffffffff80ffa9d7 in dmu_buf_update_user (db_fake=0x0, old_user_ptr=0xffffff0148924468, user_ptr=0x0, user_data_ptr_ptr=0x0, evict_func=0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1807 #12 0xffffffff810401e8 in zfs_znode_dmu_fini (zp=0xffffff0148924468) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:551 #13 0xffffffff8105fcee in zfs_freebsd_reclaim (ap=Variable "ap" is not available. 
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4381 #14 0xffffffff8061ae05 in vgonel (vp=0xffffff014893f3b0) at vnode_if.h:830 #15 0xffffffff8061e975 in vflush (mp=0xffffff01468355e0, rootrefs=0, flags=0, td=0xffffff01468a3000) at /usr/src/sys/kern/vfs_subr.c:2449 #16 0xffffffff8105a598 in zfs_umount (vfsp=0xffffff01468355e0, fflag=524288) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:996 #17 0xffffffff80616336 in dounmount (mp=0xffffff01468355e0, flags=524288, td=Variable "td" is not available. ) at /usr/src/sys/kern/vfs_mount.c:1289 #18 0xffffffff8061be54 in vfs_unmountall () at /usr/src/sys/kern/vfs_subr.c:3141 #19 0xffffffff8058b58f in boot (howto=0) at /usr/src/sys/kern/kern_shutdown.c:401 #20 0xffffffff8058b8b8 in reboot (td=Variable "td" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:173 #21 0xffffffff8083d4af in syscall (frame=0xffffff810f20ac80) at /usr/src/sys/amd64/amd64/trap.c:984 #22 0xffffffff80823b61 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:373 #23 0x000000080078f96c in ?? () Previous frame inner to this frame (corrupt stack?) 
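
A note on why the faulting pointer is a small non-zero number. In the kgdb trace above, frame 11 has db_fake=0x0 and frame 10 has sx=0x58, which is consistent with the lock being a member at some fixed offset inside the structure behind the NULL pointer. What follows is only a minimal sketch in C of that arithmetic plus the kind of NULL guard discussed in this thread; struct toy_dbuf, struct toy_lock and both functions are hypothetical stand-ins, not the real dmu_buf or sx definitions, and the 0x58 padding is chosen purely to mirror the number in the trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a lock; the real struct sx looks different. */
struct toy_lock {
	uintptr_t state;
};

/* Hypothetical stand-in for the buffer; padded so the lock sits at 0x58. */
struct toy_dbuf {
	char		payload[0x58];
	struct toy_lock	lock;
};

/*
 * Address the locking code would be handed if 'db' were used unchecked.
 * With db == NULL this is simply the member offset.
 */
static uintptr_t
toy_lock_address(const struct toy_dbuf *db)
{
	/* Layout assumption of this sketch only. */
	assert(offsetof(struct toy_dbuf, lock) == 0x58);
	return ((uintptr_t)db + offsetof(struct toy_dbuf, lock));
}

/*
 * Guarded teardown: returns 0 when there is nothing to tear down
 * (for example because the file system is already gone), 1 otherwise.
 */
static int
toy_dmu_fini(struct toy_dbuf *db)
{
	if (db == NULL)
		return (0);
	db->lock.state = 0;
	return (1);
}
```

Calling toy_lock_address(NULL) yields 0x58, the same shape of value as the sx argument in frame 10, while toy_dmu_fini(NULL) returns early instead of touching the lock.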
From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 08:40:48 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3F7D81065689 for ; Wed, 29 Jul 2009 08:40:48 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id 262238FC18 for ; Wed, 29 Jul 2009 08:40:47 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id E087445C9B; Wed, 29 Jul 2009 10:18:46 +0200 (CEST) Received: from localhost (pjd-w.wheel.pl [10.0.1.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id C05B745C8A; Wed, 29 Jul 2009 10:18:34 +0200 (CEST) Date: Wed, 29 Jul 2009 10:19:00 +0200 From: Pawel Jakub Dawidek To: cooper Message-ID: <20090729081900.GC1586@garage.freebsd.pl> References: <4A6E970A.1060503@gmail.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="nFreZHaLTZJo0R7j" Content-Disposition: inline In-Reply-To: <4A6E970A.1060503@gmail.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 8.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-5.9 required=4.5 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: About RAIDZ geometry X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 08:40:48 -0000 --nFreZHaLTZJo0R7j Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: 
quoted-printable

On Tue, Jul 28, 2009 at 02:13:30PM +0800, cooper wrote:
> Hi folks,
>
> Now that ZFS has been ported to FreeBSD, I think it's proper to ask
> these questions in this mailing list.
>
> Is there any detailed documentation about RAIDZ geometry? I have been
> wondering how zfs remembers which disk sectors are for data and which
> ones are for parity.
>
> They claimed "Every block is its own RAID-Z stripe" does that mean zfs
> writes exactly one block in every vdev_raidz_io_start if the io type of
> the zio is write?
>
> And also claimed "You have to traverse the filesystem metadata to
> determine the RAID-Z geometry" where is this metadata stored?

If you have a file system separated from the volume manager doing RAID, the
volume manager has no knowledge of where the live data is, etc.

Imagine a ZFS 5-disk RAIDZ1 configuration:

	disk0   disk1   disk2     disk3   disk4

ZFS' file system level wants to store two sectors of data, so it sends
1kB of data to the lower ZFS layers. You then get something like this:

	disk0   disk1   disk2     disk3   disk4
	data0   data1   parity01

Now ZFS wants to store one sector and you get this:

	disk0   disk1   disk2     disk3   disk4
	data0   data1   parity01  data2   parity2

ZFS can do that because it knows where and how to read the data back
(it is both a file system and a volume manager). ZFS will never try to read
three sectors as one block of data. This is impossible to do when the file
system is separated from the volume manager.

--
Pawel Jakub Dawidek http://www.wheel.pl
pjd@FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
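
The two writes in the example above can be replayed with a toy model. This is only a sketch under stated assumptions: every name in it (toy_pool, toy_write_block, toy_disk_of) is made up for illustration, slots are simply handed out disk0..disk4 and then on to the next row, parity is a plain XOR placed after the data as in the picture, and the real RAID-Z placement in ZFS differs in detail. The point it tries to show is that each block carries its own parity sector, so the stripe width follows the block size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NDISKS	5	/* the 5-disk RAIDZ1 from the example */
#define TOY_SECTOR	512	/* bytes per sector, two sectors = 1kB */
#define TOY_ROWS	4	/* sectors per toy disk */
#define TOY_NSLOTS	(TOY_NDISKS * TOY_ROWS)

enum toy_kind { TOY_FREE = 0, TOY_DATA, TOY_PARITY };

struct toy_pool {
	uint8_t		sector[TOY_NSLOTS][TOY_SECTOR];
	enum toy_kind	kind[TOY_NSLOTS];
	size_t		next;	/* next free slot */
};

/* Disk that holds a given slot: slots walk disk0..disk4, then wrap. */
static size_t
toy_disk_of(size_t slot)
{
	return (slot % TOY_NDISKS);
}

/*
 * Store one block of 'nsect' data sectors followed by one XOR parity
 * sector. Returns the slot of the first data sector, or -1 if the block
 * is too wide for the disks or does not fit in the pool.
 */
static long
toy_write_block(struct toy_pool *p, const uint8_t *data, size_t nsect)
{
	assert(p != NULL && data != NULL);
	if (nsect == 0 || nsect > TOY_NDISKS - 1)
		return (-1);
	if (p->next + nsect + 1 > TOY_NSLOTS)
		return (-1);

	size_t first = p->next;
	uint8_t *parity = p->sector[first + nsect];

	memset(parity, 0, TOY_SECTOR);
	for (size_t s = 0; s < nsect; s++) {
		memcpy(p->sector[first + s], data + s * TOY_SECTOR, TOY_SECTOR);
		p->kind[first + s] = TOY_DATA;
		for (size_t i = 0; i < TOY_SECTOR; i++)
			parity[i] ^= data[s * TOY_SECTOR + i];
	}
	p->kind[first + nsect] = TOY_PARITY;
	p->next = first + nsect + 1;
	return ((long)first);
}
```

Writing a two-sector block and then a one-sector block into a zeroed toy_pool leaves data on disk0, disk1 and disk3 and parity on disk2 and disk4, matching the last picture. Nothing on the toy disks themselves says where one block ends and the next begins; only the caller's record of the returned slot and the sector count does, which is the role the file system metadata plays in the explanation above.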
--nFreZHaLTZJo0R7j Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFKcAX0ForvXbEpPzQRAoRMAJ0arT2tov9eIAdAXWCsAkmCvEAn8ACZAfEQ s3m4ujNZFoXS9vTiJ8ylK08= =VROf -----END PGP SIGNATURE----- --nFreZHaLTZJo0R7j-- From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 08:47:18 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 748071065670; Wed, 29 Jul 2009 08:47:18 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id 2085D8FC12; Wed, 29 Jul 2009 08:47:16 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 3A50845C8C; Wed, 29 Jul 2009 10:47:11 +0200 (CEST) Received: from localhost (pjd.wheel.pl [10.0.1.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id E8A4045683; Wed, 29 Jul 2009 10:46:57 +0200 (CEST) Date: Wed, 29 Jul 2009 10:47:23 +0200 From: Pawel Jakub Dawidek To: Andriy Gapon Message-ID: <20090729084723.GD1586@garage.freebsd.pl> References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="8GpibOaaTibBMecb" Content-Disposition: inline In-Reply-To: <4A6EC9E2.5070200@icyb.net.ua> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 8.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-5.9 required=4.5 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-fs@FreeBSD.org, "O. 
Hartmann" , freebsd-current@FreeBSD.org, spambox@haruhiism.net Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 08:47:18 -0000 --8GpibOaaTibBMecb Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Jul 28, 2009 at 12:50:26PM +0300, Andriy Gapon wrote: > on 27/07/2009 22:58 O. Hartmann said the following: > > Juergen Unger wrote: > [snip] > >>> _sx_xlock(3c,0,874aa28d,70f,8ae9a9f8,...) at _sx_xlock+0x43 > >>> dmu_buf_update_user(0,8ae9a9f8,0,0,0,...) at dmu_buf_update_user+0x35 > >>> zfs_znode_dmu_fini(8ae9a9f8,874b312d,1114,110b,879ab000,...) at zfs_z= node_dmu_f3 > >>> zfs_freebsd_reclaim(fcd29c3c,1,0,8ec63754,fcd29c60,...) at zfs_freebs= d_reclaim+0 > >>> VOP_RECLAIM_APV(874b65a0,fcd29c3c,0,0,8ec637c8,...) at VOP_RECLAIM_AP= V+0xa5 > >>> vgonel(8ec637c8,0,80c77037,386,0,...) at vgonel+0x1a4 > >>> vnlru_free(80f2a0f0,0,80c77037,300,3e8,...) at vnlru_free+0x2d5 > >>> vnlru_proc(0,fcd29d38,80c652bc,33e,871932a8,...) at vnlru_proc+0x80 > >>> fork_exit(8090d960,0,fcd29d38) at fork_exit+0xb8 > >>> fork_trampoline() at fork_trampoline+0x8 > [snip] > >=20 > > I see a similar problem on two SMP boxes (is your SMP?), but in my case, > > it seems not to be ZFS related although I also use ZFS as /home filesys= tem >=20 > In this case this does seem to be caused by ZFS. > >From the backtrace we see that _sx_xlock() is called on bogus struct sx = pointer > (0x3c) and this is caused by dmu_buf_update_user() called with NULL first= argument > (dmu_buf_t). Which means that znode_t z_dbuf was NULL - this could have b= een > caught by ASSERT in zfs_znode_dmu_fini if it were enabled. 
>
> If you have the crash dump, then it would be interesting to examine znode_t
> structure ('zp' argument) in zfs_znode_dmu_fini.
>
> P.S. I see that zfs_inactive checks for z_dbuf being NULL and there is the
> following comment:
> /*
>  * The fs has been unmounted, or we did a
>  * suspend/resume and this file no longer exists.
>  */
> Maybe zfs_freebsd_reclaim should do the same?

Yes, you might be right. Could you guys, who can reproduce it, try this patch:

	http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch

--
Pawel Jakub Dawidek http://www.wheel.pl
pjd@FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!

--8GpibOaaTibBMecb
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFKcAybForvXbEpPzQRAr1LAJ4hh/MnGNtjpDIj53DAz9C6pRYm9QCfcwlj
o4+rX0e6DP6AhsOI90IrHKw=
=/BOp
-----END PGP SIGNATURE-----

--8GpibOaaTibBMecb--

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 09:20:02 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B8C3D1065672 for ; Wed, 29 Jul 2009 09:20:02 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 88DC58FC14 for ; Wed, 29 Jul 2009 09:20:02 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n6T9K2o7025797 for ; Wed, 29 Jul 2009 09:20:02 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n6T9K2TZ025796; Wed, 29 Jul 2009 09:20:02 GMT (envelope-from gnats) Date: Wed, 29 Jul 2009 09:20:02 GMT Message-Id: <200907290920.n6T9K2TZ025796@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Marc Olzheim Cc:
Subject: Re: kern/127213: [tmpfs] sendfile on tmpfs data corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Marc Olzheim List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 09:20:03 -0000 The following reply was made to PR kern/127213; it has been noted by GNATS. From: Marc Olzheim To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/127213: [tmpfs] sendfile on tmpfs data corruption Date: Wed, 29 Jul 2009 10:51:40 +0200 This is not just a bug, but a security hole. What you are seeing is the content of other VM pages. This has exactly the same symptoms as http://security.freebsd.org/advisories/FreeBSD-SA-05:02.sendfile.asc In short: do not use tmpfs unless you trust all users on the system with root access. From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 09:52:25 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BE1B11065679; Wed, 29 Jul 2009 09:52:25 +0000 (UTC) (envelope-from r.c.ladan@gmail.com) Received: from mail-ew0-f213.google.com (mail-ew0-f213.google.com [209.85.219.213]) by mx1.freebsd.org (Postfix) with ESMTP id 1ECF58FC0A; Wed, 29 Jul 2009 09:52:24 +0000 (UTC) (envelope-from r.c.ladan@gmail.com) Received: by ewy9 with SMTP id 9so542417ewy.43 for ; Wed, 29 Jul 2009 02:52:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=qew2moPiidwNvCxOfP0vo+blsbr56CIu+mnDvAfNh5Y=; b=Omo3txuLoVc1xXCbuExblrA/kAaWwTJkI/7v7nJ4fTJYVaD1kAkreDGwwy/3rj/r3k Q/q14tfTQs0SpMVXuXssIUGMkmcTWKKBqY45WbPuO41/0hFYDeDcf9G66Do/6q56XD5S Mg3UYYfb26KetFw6ut0olDQ84oRZUVGdoiM/U= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; 
h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=tF/LtHaWN0dr4FxQIYtpegAXlBVZRJEnl+u27HXZ22TyYvGlKoVWcHUiYyS8ZoxIa7 oTqMxlNcjJtVYjwJNGU0ZB2ZYGnCshe/7w1CdO/idQo5NrCzHxiD0b+NTbiKUVmAYy3w 1KoBzIzdSvjrvQ5IkZhw24Ddlnw3u0Mxc9uxA= MIME-Version: 1.0 Sender: r.c.ladan@gmail.com Received: by 10.216.25.144 with SMTP id z16mr2291079wez.179.1248861144056; Wed, 29 Jul 2009 02:52:24 -0700 (PDT) In-Reply-To: <200907281038.30277.jhb@freebsd.org> References: <200907271400.n6RE05Rv056472@freefall.freebsd.org> <200907280941.32840.jhb@freebsd.org> <200907281038.30277.jhb@freebsd.org> Date: Wed, 29 Jul 2009 11:52:24 +0200 X-Google-Sender-Auth: 9a9dc308b3ac5ac7 Message-ID: From: Rene Ladan To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 09:52:26 -0000

2009/7/28 John Baldwin :
> On Tuesday 28 July 2009 10:03:40 am Rene Ladan wrote:
>> 2009/7/28 John Baldwin :
>> > On Monday 27 July 2009 10:00:05 am Rene Ladan wrote:
>> >> The following reply was made to PR kern/136945; it has been noted by GNATS.
>> >>
>> >> From: Rene Ladan
>> >> To: John Baldwin
>> >> Cc: bug-followup@freebsd.org
>> >> Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
>> >> Date: Mon, 27 Jul 2009 15:51:15 +0200
>> >>
>> >>  2009/7/27 John Baldwin :
>> >>  > I would actually expect this to be the correct order for these two locks.  Can
>> >>  > you capture the output of the 'debug.witness.fullgraph' sysctl to a file?
>> >>  >
>> >>  Yes, see attachment.  I'm still running the same 8.0-BETA2.
>> >
>> > Hmm, the attachment was eaten by a grue, can you post the file somewhere?
>> >
>> Yes, see ftp://rene-ladan.nl/pub/freebsd/kern_136945.txt
>
> Ok, it looks like it did encounter a UFS -> filedesc order at some point.  Can
> you patch sys/kern/subr_witness.c to add a section to the order_lists[] array
> after the 'ZFS locking list' and before the spin locks list that looks like
> this:
>
>        { "filedesc structure", &lock_class_sx },
>        { "ufs", &lock_class_lockmgr},
>        { NULL, NULL },
>
The LOR seems to be gone; previously it showed up only once, right after
booting the system. But now a new LOR (according to the LOR page) seems to
pop up:

Trying to mount root from ufs:/dev/ad0s1a
lock order reversal:
 1st 0xffffff0002a4ad80 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_vfsops.c:1465
 2nd 0xffffff0002b29a48 filedesc structure (filedesc structure) @ /usr/src/sys/kern/kern_descrip.c:2478
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
_sx_xlock() at _sx_xlock+0x44
mountcheckdirs() at mountcheckdirs+0x80
vfs_donmount() at vfs_donmount+0xfbf
kernel_mount() at kernel_mount+0xa1
vfs_mountroot_try() at vfs_mountroot_try+0x177
vfs_mountroot() at vfs_mountroot+0x47d
start_init() at start_init+0x62
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff800001ad30, rbp = 0 ---

The output of `df' and `mount' looks ok.
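
[Editor's illustration, not part of the original message.] The order_lists[] entry quoted above tells witness that "filedesc structure" is expected to be taken before "ufs". The idea can be sketched in user space as follows. This is only a rough sketch: struct order_entry, order_index() and order_ok() are hypothetical names made up for the example and do not reflect the actual data structures in sys/kern/subr_witness.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of a declared lock order list. */
struct order_entry {
	const char *name;
};

/* Declared order: earlier entries are acquired first; NULL terminates. */
static const struct order_entry order_list[] = {
	{ "filedesc structure" },
	{ "ufs" },
	{ NULL },
};

/* Return the position of 'name' in the list, or -1 if it is not listed. */
static int
order_index(const char *name)
{
	for (int i = 0; order_list[i].name != NULL; i++)
		if (strcmp(order_list[i].name, name) == 0)
			return (i);
	return (-1);
}

/*
 * Return 1 if acquiring 'second' while holding 'first' respects the
 * declared order, 0 if it is a reversal. Locks that are not in the list
 * are not checked. In this sketch, taking the same name twice counts as
 * a violation.
 */
static int
order_ok(const char *first, const char *second)
{
	int a = order_index(first);
	int b = order_index(second);

	if (a < 0 || b < 0)
		return (1);
	return (a < b);
}
```

With this table, the acquisition reported in the backtrace above (1st "ufs", 2nd "filedesc structure") is the case that order_ok() rejects, which is consistent with a reversal being reported once the order is declared explicitly.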
René
From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 10:32:29 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EB10A106564A; Wed, 29 Jul 2009 10:32:29 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 674788FC26; Wed, 29 Jul 2009 10:32:29 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:57600 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MW6SK-0004um-4k; Wed, 29 Jul 2009 12:32:22 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 66940117D82; Wed, 29 Jul 2009 12:32:15 +0200 (CEST) Message-Id: From: Thomas Backman To: Pawel Jakub Dawidek In-Reply-To: <20090729084723.GD1586@garage.freebsd.pl> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Wed, 29 Jul 2009 12:32:13 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MW6SK-0004um-4k.
X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MW6SK-0004um-4k 502cb08cd2106d43743912be71b453ea Cc: freebsd-fs@FreeBSD.org, FreeBSD current , Andriy Gapon Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 10:32:30 -0000 On Jul 29, 2009, at 10:47, Pawel Jakub Dawidek wrote: > On Tue, Jul 28, 2009 at 12:50:26PM +0300, Andriy Gapon wrote: >> on 27/07/2009 22:58 O. Hartmann said the following: >>> Juergen Unger wrote: >> [snip] >>>>> _sx_xlock(3c,0,874aa28d,70f,8ae9a9f8,...) at _sx_xlock+0x43 >>>>> dmu_buf_update_user(0,8ae9a9f8,0,0,0,...) at dmu_buf_update_user >>>>> +0x35 >>>>> zfs_znode_dmu_fini(8ae9a9f8,874b312d,1114,110b,879ab000,...) at >>>>> zfs_znode_dmu_f3 >>>>> zfs_freebsd_reclaim(fcd29c3c,1,0,8ec63754,fcd29c60,...) at >>>>> zfs_freebsd_reclaim+0 >>>>> VOP_RECLAIM_APV(874b65a0,fcd29c3c,0,0,8ec637c8,...) at >>>>> VOP_RECLAIM_APV+0xa5 >>>>> vgonel(8ec637c8,0,80c77037,386,0,...) at vgonel+0x1a4 >>>>> vnlru_free(80f2a0f0,0,80c77037,300,3e8,...) at vnlru_free+0x2d5 >>>>> vnlru_proc(0,fcd29d38,80c652bc,33e,871932a8,...) at vnlru_proc >>>>> +0x80 >>>>> fork_exit(8090d960,0,fcd29d38) at fork_exit+0xb8 >>>>> fork_trampoline() at fork_trampoline+0x8 >> [snip] >>> >>> I see a similar problem on two SMP boxes (is your SMP?), but in my >>> case, >>> it seems not to be ZFS related although I also use ZFS as /home >>> filesystem >> >> In this case this does seem to be caused by ZFS. >>> From the backtrace we see that _sx_xlock() is called on bogus >>> struct sx pointer >> (0x3c) and this is caused by dmu_buf_update_user() called with NULL >> first argument >> (dmu_buf_t). Which means that znode_t z_dbuf was NULL - this could >> have been >> caught by ASSERT in zfs_znode_dmu_fini if it were enabled. 
>> >> If you have the crash dump, then it would be interesting to examine >> znode_t >> structure ('zp' argument) in zfs_znode_dmu_fini. >> >> P.S. I see that zfs_inactive checks for z_dbuf being NULL and there >> is the >> following comment: >> /* >> * The fs has been unmounted, or we did a >> * suspend/resume and this file no longer exists. >> */ >> Maybe zfs_freebsd_reclaim should do the same? > > Yes, you might be right. > > Could you guys, who can reproduce it, try this patch: > > http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch OFF TOPIC: Due to similarities in the backtrace between this and a panic I've been seeing on exporting after zfs recv (see http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009105.html and also http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009174.html for a panics-every-time script) I tried this patch. Unfortunately, I still get the same panic (from vgonel() and up, it's the same, except for my typo in the linked email.) Just thought I should point it out. Except for temporary storage problems when moving my data to ZFS, this panic is the last hurdle in not using FreeBSD for me. 
:/ Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 11:21:31 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6956A106564A; Wed, 29 Jul 2009 11:21:31 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 434808FC13; Wed, 29 Jul 2009 11:21:29 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id OAA29882; Wed, 29 Jul 2009 14:21:26 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4A7030B6.8010205@icyb.net.ua> Date: Wed, 29 Jul 2009 14:21:26 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090630) MIME-Version: 1.0 To: Thomas Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> In-Reply-To: X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 11:21:31 -0000 on 29/07/2009 13:32 Thomas Backman said the following: > OFF TOPIC: > Due to similarities in the backtrace between this and a panic I've been > seeing on exporting after zfs recv (see > http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009105.html and > also > http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009174.html for > a panics-every-time script) I tried this patch. 
Unfortunately, I still > get the same panic (from vgonel() and up, it's the same, except for my > typo in the linked email.) Your panics are superficially similar but seem to be different. But it is hard to tell as function argument values are not available in your backtraces for the interesting calls. One difference that I see is that your panics happen one level below _sx_xlock, in _sx_xlock_hard, and the sx argument value appears to be far from NULL (0xffffff0043557d50) - in the panic that started this thread it was near NULL. Another difference is that your panics do not involve zfs_znode_dmu_fini and dmu_buf_update_user; in your case _sx_xlock is called directly from zfs_freebsd_reclaim. So it must be a problem with a different lock. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 11:42:21 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F238A106566C; Wed, 29 Jul 2009 11:42:20 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id CD8E78FC14; Wed, 29 Jul 2009 11:42:19 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id OAA00438; Wed, 29 Jul 2009 14:42:17 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4A703598.8080809@icyb.net.ua> Date: Wed, 29 Jul 2009 14:42:16 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090630) MIME-Version: 1.0 To: Thomas Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> In-Reply-To: <4A7030B6.8010205@icyb.net.ua> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 11:42:21 -0000 on 29/07/2009 14:21 Andriy Gapon said the following: > Your panics are superficially similar but seem to be different. > But it is hard to tell as function argument values are not available in your > backtraces for the interesting calls. > One difference that I see is that your panics happen one level below _sx_xlock, in > _sx_xlock_hard, and the sx argument value appears to be far from NULL > (0xffffff0043557d50) - in the panic that started this thread it was near NULL. > Another difference is that your panics do not involve zfs_znode_dmu_fini and > dmu_buf_update_user; in your case _sx_xlock is called directly from > zfs_freebsd_reclaim. So it must be a problem with a different lock. BTW, have you tried to reproduce the problem with INVARIANTS enabled? Do you have crashdumps with debugging symbols?
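
[Editor's illustration, not part of the original message.] The "near NULL" versus "far from NULL" distinction quoted above follows from how the address of a structure member is formed: base pointer plus member offset. If the base is NULL, the callee sees a small address equal to the offset alone. A minimal sketch, assuming a made-up layout: struct fake_dbuf and struct fake_lock are hypothetical, the padding is chosen only so that the lock lands at offset 0x3c as in the first backtrace, and the 4096-byte threshold is an arbitrary heuristic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types; the real ZFS structures are laid out differently. */
struct fake_lock {
	int state;
};

struct fake_dbuf {
	char pad[0x3c];		/* filler so that 'lock' sits at offset 0x3c */
	struct fake_lock lock;
};

/*
 * Address a callee would receive for &db->lock, computed arithmetically
 * so that nothing is dereferenced.
 */
static uintptr_t
lock_address(uintptr_t base)
{
	return (base + offsetof(struct fake_dbuf, lock));
}

/* Heuristic: an address inside the first page suggests a NULL base. */
static int
looks_like_null_deref(uintptr_t addr)
{
	return (addr < 4096);
}
```

Under these assumptions lock_address(0) yields 0x3c, the kind of value seen in the panic that started the thread, while a pointer such as 0xffffff0043557d50 is not explained by a NULL base and points at a different failure.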
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 12:25:23 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C9629106564A for ; Wed, 29 Jul 2009 12:25:23 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 1C4108FC08 for ; Wed, 29 Jul 2009 12:25:22 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id PAA01278; Wed, 29 Jul 2009 15:25:19 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4A703FAF.1090109@icyb.net.ua> Date: Wed, 29 Jul 2009 15:25:19 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090630) MIME-Version: 1.0 To: "James R. Van Artsdalen" References: <4A6FE2D6.7050300@jrv.org> In-Reply-To: <4A6FE2D6.7050300@jrv.org> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs Subject: Re: [ZFS] umount at reboot crashes X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 12:25:24 -0000 on 29/07/2009 08:49 James R. Van Artsdalen said the following: > FreeBSD bigback.housenet.jrv 8.0-BETA2 FreeBSD 8.0-BETA2 #0 r195757M: > Mon Jul 20 10:27:28 CDT 2009 > james@bigback.housenet.jrv:/usr/obj/usr/src/sys/BIGTEX amd64 > > I have a system that almost always crashes whenever it receives a ZFS > replication package ("zfs recv") that either deleted or renames > filesystems, both operations requiring unmounting. Sometimes it crashes > later in the "zfs recv", sometimes not until I reboot that system. 
> > The sx_xlock() in frame 10 seems a common theme in these crashes. Could you please try a patch posted by Pawel in a parallel thread with subject "zfs: Fatal trap 12: page fault while in kernel mode"? Please followup with your results and CC Pawel. Thank you! -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 12:46:55 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3BCF71065672; Wed, 29 Jul 2009 12:46:55 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id B641A8FC13; Wed, 29 Jul 2009 12:46:54 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:41094 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MW8XN-0002Xt-5h; Wed, 29 Jul 2009 14:45:43 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id E15613964; Wed, 29 Jul 2009 14:45:41 +0200 (CEST) Message-Id: <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> From: Thomas Backman To: Andriy Gapon In-Reply-To: <4A7030B6.8010205@icyb.net.ua> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Wed, 29 Jul 2009 14:45:39 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MW8XN-0002Xt-5h. 
X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MW8XN-0002Xt-5h 8d6ca29e893b9606c24113d6c26cb6e4 Cc: freebsd-fs@FreeBSD.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 12:46:55 -0000 On Jul 29, 2009, at 13:21, Andriy Gapon wrote: > on 29/07/2009 13:32 Thomas Backman said the following: >> OFF TOPIC: >> Due to similarities in the backtrace between this and a panic I've >> been >> seeing on exporting after zfs recv (see >> http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009105.html >> and >> also >> http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009174.html >> for >> a panics-every-time script) I tried this patch. Unfortunately, I >> still >> get the same panic (from vgonel() and up, it's the same, except for >> my >> typo in the linked email.) > > Your panics are superficially similar but seem to be different. > But it is hard to tell as function argument values are not available > in your > backtraces for the interesting calls. > One difference that I see is that your panics happen one level below > _sx_xlock, in > sx_xlock_hard and sx argument value appears to be far from NULL > (0xffffff0043557d50) - in the panic that started this thread it was > near NULL. > Another difference is that you panics do not involve > zfs_znode_dmu_fini and > mu_buf_update_user, in your case sx_xlock is called directly from > zfs_freebsd_reclaim. So it must a problem with a different lock. 
> > -- > Andriy Gapon The DDB output from one panic does involve zfs_znode_dmu_fini and dmu_buf_update_user: _sx_xlock() dmu_buf_update_user()+0x47 zfs_znode_dmu_fini() zfs_freebsd_reclaim() VOP_RECLAIM_APV() vgonel() vflush() zfs_umount() dounmount() unmount() syscall() Xfast_syscall() (Sorry if the formatting got screwed up above.) > BTW, have you tried to reproduce the problem with INVARIANTS enabled? > Do you have crashdumps with debugging symbols? I tried again with INVARIANTS, but see no difference in the panic, the DDB bt or the KGDB bt. What does invariants really do? (Not sure how to use it to my advantage here :) Re: debugging symbols; isn't that the default? I do have a .symbols file for all the files in /boot/kernel, but that's all I know to be honest. Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 13:03:45 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7EF71106564A; Wed, 29 Jul 2009 13:03:45 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 51F878FC17; Wed, 29 Jul 2009 13:03:43 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA02442; Wed, 29 Jul 2009 16:03:40 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4A7048A9.4020507@icyb.net.ua> Date: Wed, 29 Jul 2009 16:03:37 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090630) MIME-Version: 1.0 To: Thomas Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> In-Reply-To: 
<97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 13:03:46 -0000 on 29/07/2009 15:45 Thomas Backman said the following: > On Jul 29, 2009, at 13:21, Andriy Gapon wrote: > >> on 29/07/2009 13:32 Thomas Backman said the following: >>> OFF TOPIC: >>> Due to similarities in the backtrace between this and a panic I've been >>> seeing on exporting after zfs recv (see >>> http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009105.html and >>> >>> also >>> http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009174.html for >>> [snip] > The DDB output from one panic does involve zfs_znode_dmu_fini and > dmu_buf_update_user: > _sx_xlock() > dmu_buf_update_user()+0x47 > zfs_znode_dmu_fini() > zfs_freebsd_reclaim() > VOP_RECLAIM_APV() > vgonel() > vflush() > zfs_umount() > dounmount() > unmount() > syscall() > Xfast_syscall() > (Sorry if the formatting got screwed up above.) Hmm, then you experienced two different kinds of panics. To quote the link you posted earlier: [1] http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009174.html ... #10 0xffffffff8086e007 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:223 #11 0xffffffff805a4989 in _sx_xlock_hard (sx=0xffffff0043557d50, tid=18446742975830720512, opts=Variable "opts" is not available.) at /usr/src/sys/kern/kern_sx.c:575 #12 0xffffffff805a52fe in _sx_xlock (sx=Variable "sx" is not available.) at sx.h:155 #13 0xffffffff80fe2995 in zfs_freebsd_reclaim () from /boot/kernel/zfs.ko ... 
So now that you said that the patch didn't fix the problem for you, could you please clarify what panic you do see after applying it? >> BTW, have you tried to reproduce the problem with INVARIANTS enabled? >> Do you have crashdumps with debugging symbols? > > I tried again with INVARIANTS, but see no difference in the panic, the > DDB bt or the KGDB bt. What does invariants really do? (Not sure how to > use it to my advantage here :) INVARIANTS enables KASSERTs in various parts of the code, which can help catch bugs early that would otherwise surface as cryptic panics later. > Re: debugging symbols; isn't that the default? I do have a .symbols file > for all the files in /boot/kernel, but that's all I know to be honest. Ok, then if you get a crash dump, you should be able to use kgdb, which can show line numbers and lets you examine variables. P.S. Sorry if I am missing context from your previous reports. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 13:24:33 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 085321065673; Wed, 29 Jul 2009 13:24:33 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 82AC28FC1A; Wed, 29 Jul 2009 13:24:32 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:58148 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MW98f-0002x9-5H; Wed, 29 Jul 2009 15:24:15 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id BECE72ADC; Wed, 29 Jul 2009 15:24:14 +0200 (CEST) Message-Id: <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> From: Thomas
Backman To: Andriy Gapon In-Reply-To: <4A7048A9.4020507@icyb.net.ua> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Wed, 29 Jul 2009 15:24:12 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MW98f-0002x9-5H. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MW98f-0002x9-5H d4310a4baa18bf7758861e6aaf86cc67 Cc: freebsd-fs@FreeBSD.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 13:24:33 -0000 On Jul 29, 2009, at 15:03, Andriy Gapon wrote: > on 29/07/2009 15:45 Thomas Backman said the following: >> On Jul 29, 2009, at 13:21, Andriy Gapon wrote: >> >>> on 29/07/2009 13:32 Thomas Backman said the following: >>>> OFF TOPIC: >>>> Due to similarities in the backtrace between this and a panic >>>> I've been >>>> seeing on exporting after zfs recv (see >>>> http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009105.html >>>> and >>>> >>>> also >>>> http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009174.html >>>> for >>>> > [snip] >> The DDB output from one panic does involve zfs_znode_dmu_fini and >> dmu_buf_update_user: >> _sx_xlock() >> dmu_buf_update_user()+0x47 >> zfs_znode_dmu_fini() >> zfs_freebsd_reclaim() >> VOP_RECLAIM_APV() >> vgonel() >> vflush() >> zfs_umount() >> dounmount() >> unmount() >> syscall() >> Xfast_syscall() >> (Sorry 
if the formatting got screwed up above.) > > Hmm, then you experienced two different kinds of panics. > To quote the link you posted earlier: > [1] http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009174.html > ... > #10 0xffffffff8086e007 in calltrap () at /usr/src/sys/amd64/amd64/ > exception.S:223 > #11 0xffffffff805a4989 in _sx_xlock_hard (sx=0xffffff0043557d50, > tid=18446742975830720512, opts=Variable "opts" is not available.) at > /usr/src/sys/kern/kern_sx.c:575 > #12 0xffffffff805a52fe in _sx_xlock (sx=Variable "sx" is not > available.) at sx.h:155 > #13 0xffffffff80fe2995 in zfs_freebsd_reclaim () from /boot/kernel/ > zfs.ko > ... > > So now that you said that the patch didn't fix the problem for you, > could you > please clarify what panic you do see after applying it? > [...] > > Ok, then if you get a crash dump, you are able use kgdb and it will > be able to > produce line numbers and will allow to examine variables. > > P.S. sorry if I miss context of your previous reports. > > -- > Andriy Gapon Hmm, you are indeed right, it's not the same panic. The backtrace I got just now with INVARIANTS is the one you quoted above. I still get the "_sx_xlock (sx=Variable "sx" is not available.)" and "_sx_xlock_hard (sx=0xffffff00090d5018, ..., opts=Variable "opts" is not available.)" though. Am I missing some option (I've got GENERIC, minus WITNESS plus DTRACE, now that INVARIANTS is back in place), or does this "just happen"? Here's the "full" backtrace (minus the panic(), trap() etc.): #10 0xffffffff8057dfe7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224 #11 0xffffffff80342b99 in _sx_xlock_hard (sx=0xffffff00090d5018, tid=18446742974952890368, opts=Variable "opts" is not available. ) at /usr/src/sys/kern/kern_sx.c:575 #12 0xffffffff8034350e in _sx_xlock (sx=Variable "sx" is not available. 
) at sx.h:155 #13 0xffffffff80af7596 in zfs_freebsd_reclaim () from /boot/kernel/ zfs.ko #14 0xffffffff805c5c2a in VOP_RECLAIM_APV (vop=0xffffff00090d5018, a=0xffffff00090d5000) at vnode_if.c:1926 #15 0xffffffff803c839e in vgonel (vp=0xffffff0009252588) at vnode_if.h: 830 #16 0xffffffff803cc958 in vflush (mp=0xffffff0002cd7bc0, rootrefs=0, flags=0, td=0xffffff002cffe000) at /usr/src/sys/kern/vfs_subr.c:2449 #17 0xffffffff80af2038 in zfs_umount () from /boot/kernel/zfs.ko #18 0xffffffff803c55ca in dounmount (mp=0xffffff0002cd7bc0, flags=47020992, td=Variable "td" is not available. ) at /usr/src/sys/kern/vfs_mount.c:1289 #19 0xffffffff803c5df8 in unmount (td=0xffffff002cffe000, uap=0xffffff803e98bbf0) at /usr/src/sys/kern/vfs_mount.c:1174 #20 0xffffffff805980bf in syscall (frame=0xffffff803e98bc80) at /usr/src/sys/amd64/amd64/trap.c:984 #21 0xffffffff8057e2c1 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:373 #22 0x000000080104e9ec in ?? () Previous frame inner to this frame (corrupt stack?) This happens on "zpool export" on the receiving pool (never on the sending pool) when running the script in the posts above. (Which, I realize, few people will have run.) I also get another panic when manually doing zfs unmount on the root FS on the pool, rather than exporting it: http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009209.html Now we're drifting way off topic, though. Sorry. 
Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 13:46:06 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8FD9C1065674; Wed, 29 Jul 2009 13:46:06 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 4B5808FC1E; Wed, 29 Jul 2009 13:46:04 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA03430; Wed, 29 Jul 2009 16:46:01 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4A705299.8060504@icyb.net.ua> Date: Wed, 29 Jul 2009 16:46:01 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090630) MIME-Version: 1.0 To: Thomas Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> In-Reply-To: <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 13:46:07 -0000 on 29/07/2009 16:24 Thomas Backman said the following: > Hmm, you are indeed right, it's not the same panic. The backtrace I got > just now with INVARIANTS is the one you quoted above. 
> I still get the "_sx_xlock (sx=Variable "sx" is not available.)" and
> "_sx_xlock_hard (sx=0xffffff00090d5018, ..., opts=Variable "opts" is not
> available.)" though.
> Am I missing some option (I've got GENERIC, minus WITNESS plus DTRACE,
> now that INVARIANTS is back in place), or does this "just happen"?

Not sure what this question is about. What option, what "this" :-)

> Here's the "full" backtrace (minus the panic(), trap() etc.):
>
> #10 0xffffffff8057dfe7 in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:224
> #11 0xffffffff80342b99 in _sx_xlock_hard (sx=0xffffff00090d5018,
>     tid=18446742974952890368, opts=Variable "opts" is not available.
> ) at /usr/src/sys/kern/kern_sx.c:575
> #12 0xffffffff8034350e in _sx_xlock (sx=Variable "sx" is not available.
> ) at sx.h:155
> #13 0xffffffff80af7596 in zfs_freebsd_reclaim () from /boot/kernel/zfs.ko
> #14 0xffffffff805c5c2a in VOP_RECLAIM_APV (vop=0xffffff00090d5018,
>     a=0xffffff00090d5000) at vnode_if.c:1926
> #15 0xffffffff803c839e in vgonel (vp=0xffffff0009252588) at vnode_if.h:830
> #16 0xffffffff803cc958 in vflush (mp=0xffffff0002cd7bc0, rootrefs=0,
>     flags=0,
>     td=0xffffff002cffe000) at /usr/src/sys/kern/vfs_subr.c:2449
> #17 0xffffffff80af2038 in zfs_umount () from /boot/kernel/zfs.ko
> #18 0xffffffff803c55ca in dounmount (mp=0xffffff0002cd7bc0, flags=47020992,
>     td=Variable "td" is not available.
> ) at /usr/src/sys/kern/vfs_mount.c:1289
> #19 0xffffffff803c5df8 in unmount (td=0xffffff002cffe000,
>     uap=0xffffff803e98bbf0) at /usr/src/sys/kern/vfs_mount.c:1174
> #20 0xffffffff805980bf in syscall (frame=0xffffff803e98bc80)
>     at /usr/src/sys/amd64/amd64/trap.c:984
> #21 0xffffffff8057e2c1 in Xfast_syscall ()
>     at /usr/src/sys/amd64/amd64/exception.S:373
> #22 0x000000080104e9ec in ?? ()
> Previous frame inner to this frame (corrupt stack?)

Looks like your zfs module is built without debugging symbols?
Maybe because it was built/rebuilt individually, not as part of the kernel build?

It would be useful to get line number in frame 13 and examine sx object in frame 11, esp. sx_lock field. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 13:52:44 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D857A1065693; Wed, 29 Jul 2009 13:52:43 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 5BDA88FC24; Wed, 29 Jul 2009 13:52:43 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:46627 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MW9a1-0001Fa-49; Wed, 29 Jul 2009 15:52:31 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id B68C9163F72; Wed, 29 Jul 2009 15:52:29 +0200 (CEST) Message-Id: From: Thomas Backman To: Andriy Gapon In-Reply-To: <4A705299.8060504@icyb.net.ua> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Wed, 29 Jul 2009 15:52:27 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MW9a1-0001Fa-49. 
X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MW9a1-0001Fa-49 97b855054291372e8d10e59664a6bd61 Cc: freebsd-fs@FreeBSD.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 13:52:45 -0000 On Jul 29, 2009, at 15:46, Andriy Gapon wrote: > on 29/07/2009 16:24 Thomas Backman said the following: >> Hmm, you are indeed right, it's not the same panic. The backtrace I >> got >> just now with INVARIANTS is the one you quoted above. >> I still get the "_sx_xlock (sx=Variable "sx" is not available.)" and >> "_sx_xlock_hard (sx=0xffffff00090d5018, ..., opts=Variable "opts" >> is not >> available.)" though. >> Am I missing some option (I've got GENERIC, minus WITNESS plus >> DTRACE, >> now that INVARIANTS is back in place), or does this "just happen"? > > Not sure what this question is about. What option, what "this" :-) > >> Here's the "full" backtrace (minus the panic(), trap() etc.): >> >> #10 0xffffffff8057dfe7 in calltrap () >> at /usr/src/sys/amd64/amd64/exception.S:224 >> #11 0xffffffff80342b99 in _sx_xlock_hard (sx=0xffffff00090d5018, >> tid=18446742974952890368, opts=Variable "opts" is not available. >> ) at /usr/src/sys/kern/kern_sx.c:575 >> #12 0xffffffff8034350e in _sx_xlock (sx=Variable "sx" is not >> available. 
>> ) at sx.h:155
>> #13 0xffffffff80af7596 in zfs_freebsd_reclaim () from /boot/kernel/zfs.ko
>> #14 0xffffffff805c5c2a in VOP_RECLAIM_APV (vop=0xffffff00090d5018,
>> a=0xffffff00090d5000) at vnode_if.c:1926
>> #15 0xffffffff803c839e in vgonel (vp=0xffffff0009252588) at vnode_if.h:830
>> #16 0xffffffff803cc958 in vflush (mp=0xffffff0002cd7bc0, rootrefs=0,
>> flags=0,
>> td=0xffffff002cffe000) at /usr/src/sys/kern/vfs_subr.c:2449
>> #17 0xffffffff80af2038 in zfs_umount () from /boot/kernel/zfs.ko
>> #18 0xffffffff803c55ca in dounmount (mp=0xffffff0002cd7bc0,
>> flags=47020992,
>> td=Variable "td" is not available.
>> ) at /usr/src/sys/kern/vfs_mount.c:1289
>> #19 0xffffffff803c5df8 in unmount (td=0xffffff002cffe000,
>> uap=0xffffff803e98bbf0) at /usr/src/sys/kern/vfs_mount.c:1174
>> #20 0xffffffff805980bf in syscall (frame=0xffffff803e98bc80)
>> at /usr/src/sys/amd64/amd64/trap.c:984
>> #21 0xffffffff8057e2c1 in Xfast_syscall ()
>> at /usr/src/sys/amd64/amd64/exception.S:373
>> #22 0x000000080104e9ec in ?? ()
>> Previous frame inner to this frame (corrupt stack?)
>
> Looks like your zfs module is built without debugging symbols?
> Maybe because it was built/rebuilt individually, not as part of
> the kernel build?
>
> It would be useful to get line number in frame 13 and examine sx
> object in frame 11, esp. sx_lock field.
>
> --
> Andriy Gapon

The "this" (above) was referring to variable values not being available in a vmcore. :)

The zfs module appears to be built with symbols, and the symbols appear to be loaded in kgdb:

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /bootdir/boot/kernel/zfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /bootdir/boot/kernel/opensolaris.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko

I didn't build the module(s) individually, either; in the previous cases, it was a clean buildworld/buildkernel (even with rm -rf /usr/obj/* beforehand), and in this case "just" a buildkernel (no manual cleaning, but no -DNO_CLEAN either).

Regards,
Thomas

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 13:55:49 2009
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 574E41065673; Wed, 29 Jul 2009 13:55:49 +0000 (UTC) (envelope-from avg@icyb.net.ua)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 2D6B58FC24; Wed, 29 Jul 2009 13:55:47 +0000 (UTC) (envelope-from avg@icyb.net.ua)
Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA03667; Wed, 29 Jul 2009 16:55:45 +0300 (EEST) (envelope-from avg@icyb.net.ua)
Message-ID: <4A7054E1.5060402@icyb.net.ua>
Date: Wed, 29 Jul 2009 16:55:45 +0300
From: Andriy Gapon
User-Agent: Thunderbird 2.0.0.22 (X11/20090630)
MIME-Version: 1.0
To: Thomas Backman
References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua>
In-Reply-To:
X-Enigmail-Version: 0.95.7
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@FreeBSD.org, FreeBSD current , Pawel Jakub Dawidek
Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 13:55:50 -0000 on 29/07/2009 16:52 Thomas Backman said the following: >> >> It would be useful to get line number in frame 13 and examine sx >> object in frame >> 11, esp. sx_lock field. >> >> -- >> Andriy Gapon > The "this" (above) was referring to variable values not being available > in a vmcore. :) > > The zfs module appears to be built with symbols, and the symbols appear > to be loaded in kgdb: > > Reading symbols from /boot/kernel/zfs.ko...Reading symbols from > /bootdir/boot/kernel/zfs.ko.symbols...done. done. > Loaded symbols for /boot/kernel/zfs.ko > Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from > /bootdir/boot/kernel/opensolaris.ko.symbols...done. done. > Loaded symbols for /boot/kernel/opensolaris.ko > > I didn't build the module(s) individually, either; in the previous > cases, it was a clean buildworld/buildkernel (even with rm -rf > /usr/obj/* beforehand), and in this case "just" a buildkernel (no manual > cleaning, but no -DNO_CLEAN either). Got it. No idea unfortunately :( Could you still please examine sx in frame 11? 
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 14:04:22 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9456A106564A for ; Wed, 29 Jul 2009 14:04:22 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id AE78F8FC0C for ; Wed, 29 Jul 2009 14:04:21 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id BE73A45C8A; Wed, 29 Jul 2009 16:04:18 +0200 (CEST) Received: from localhost (pjd-w.wheel.pl [10.0.1.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 5613F45684; Wed, 29 Jul 2009 16:04:11 +0200 (CEST) Date: Wed, 29 Jul 2009 16:04:36 +0200 From: Pawel Jakub Dawidek To: Anthony Chavez Message-ID: <20090729140436.GG1586@garage.freebsd.pl> References: <4A62E0CE.1000508@hexadecagram.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Dzs2zDY0zgkG72+7" Content-Disposition: inline In-Reply-To: <4A62E0CE.1000508@hexadecagram.org> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 8.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-5.9 required=4.5 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: Re-starting a gjournal provider X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 14:04:22 -0000 --Dzs2zDY0zgkG72+7 Content-Type: text/plain; charset=us-ascii 
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Sun, Jul 19, 2009 at 03:01:02AM -0600, Anthony Chavez wrote:
> Hello freebsd-fs,
>
> I'm trying to get gjournal working on a "removable" hard disk. I use
> the term loosely, because I'm using a very simple eSATA enclosure: an
> AMS Venus DS5 [1].
>
> If I swap out disks, atacontrol cap ad0 seems sufficient enough to
> detect the new drive: the reported device model, serial number, firmware
> revision, and CHS values change as one would expect.
>
> My interpretation of [2] section 5.3 and gjournal(8) is that the
> following sequence of commands should ensure me that all write buffers
> have been flushed and bring the system to a point where it is safe to
> remove a disk.
>
> sync; sync; sync
> gjournal sync
> umount /dev/ad0s1.journal
> gjournal stop ad0s1.journal

You should first unmount and then call 'gjournal sync'.

> However, once they are executed, /dev/ad0s1.journal disappears and when
> I swap out the disk it doesn't come back. The only way I've found to
> bring it back is atacontrol detach ata0; atacontrol attach ata0, which
> doesn't seem like a wise thing to do if I have another device on the
> same channel.

It doesn't come back because something (ATA layer?) doesn't properly
remove ad0 provider. When you remove the disk, /dev/ad0 should disappear
and reappear once you insert it again. You can still do this trick after
you insert the disk again so the GEOM can schedule retaste:

	# true > /dev/ad0

> My question is, do I need to issue gjournal stop before I swap disks?
> And if so, is there any way that I can avoid the atacontrol
> detach/attach cycle that would need to take place before any mount is
> attempted so that /dev/ad0s1.journal appears (if in the drive inserted
> at the time does in fact utilize gjournal; I may want to experiment with
> having disks with either gjournal or soft updates)?
>
> And while I'm on the subject, are the (gjournal) sync commands
> preceding umount absolutely necessary in the case of removable media?

'gjournal sync' should follow unmount, not the other way around.
And it's better to do it, but 'gjournal stop' should do the same.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

--Dzs2zDY0zgkG72+7
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFKcFb0ForvXbEpPzQRAn3BAKDFQ4AkRfSa7Ll0IqIByPC48sSIiACgjH2o
cEJJA+z+AhtRw2LpfPgAChw=
=3vmZ
-----END PGP SIGNATURE-----

--Dzs2zDY0zgkG72+7--

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 14:08:16 2009
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EFE70106567D; Wed, 29 Jul 2009 14:08:16 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id AECCE8FC16; Wed, 29 Jul 2009 14:08:16 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 4AAA946B17; Wed, 29 Jul 2009 10:08:16 -0400 (EDT)
Received: from jhbbsd.hudson-trading.com (unknown [209.249.190.8]) by bigwig.baldwin.cx (Postfix) with ESMTPA id 7931B8A0A2; Wed, 29 Jul 2009 10:08:15 -0400 (EDT)
From: John Baldwin
To: Rene Ladan
Date: Wed, 29 Jul 2009 07:42:20 -0400
User-Agent: KMail/1.9.7
References: <200907271400.n6RE05Rv056472@freefall.freebsd.org> <200907281038.30277.jhb@freebsd.org>
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
Message-Id: <200907290742.20838.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by
 milter-greylist-4.0.1 (bigwig.baldwin.cx); Wed, 29 Jul 2009 10:08:15 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.5 required=4.2 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: freebsd-fs@freebsd.org
Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 29 Jul 2009 14:08:17 -0000

On Wednesday 29 July 2009 5:52:24 am Rene Ladan wrote:
> 2009/7/28 John Baldwin :
> > On Tuesday 28 July 2009 10:03:40 am Rene Ladan wrote:
> >> 2009/7/28 John Baldwin :
> >> > On Monday 27 July 2009 10:00:05 am Rene Ladan wrote:
> >> >> The following reply was made to PR kern/136945; it has been noted by GNATS.
> >> >>
> >> >> From: Rene Ladan
> >> >> To: John Baldwin
> >> >> Cc: bug-followup@freebsd.org
> >> >> Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
> >> >> Date: Mon, 27 Jul 2009 15:51:15 +0200
> >> >>
> >> >>  2009/7/27 John Baldwin :
> >> >>  > I would actually expect this to be the correct order for these two locks.  Can
> >> >>  > you capture the output of the 'debug.witness.fullgraph' sysctl to a file?
> >> >>  >
> >> >>  Yes, see attachment.  I'm still running the same 8.0-BETA2.
> >> >
> >> > Hmm, the attachment was eaten by a grue, can you post the file somewhere?
> >> >
> >> Yes, see ftp://rene-ladan.nl/pub/freebsd/kern_136945.txt
> >
> > Ok, it looks like it did encounter a UFS -> filedesc order at some point.
> > Can you patch sys/kern/subr_witness.c to add a section to the order_lists[] array
> > after the 'ZFS locking list' and before the spin locks list that looks like
> > this:
> >
> >        { "filedesc structure", &lock_class_sx },
> >        { "ufs", &lock_class_lockmgr},
> >        { NULL, NULL },
> >
> The LOR seems to be gone, previously it showed up only once right
> after booting the system.
>
> But now a new LOR (according to the LOR page) seems to pop up:
> Trying to mount root from ufs:/dev/ad0s1a
> lock order reversal:
> 1st 0xffffff0002a4ad80 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_vfsops.c:1465
> 2nd 0xffffff0002b29a48 filedesc structure (filedesc structure) @
> /usr/src/sys/kern/kern_descrip.c:2478
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> _witness_debugger() at _witness_debugger+0x49
> witness_checkorder() at witness_checkorder+0x7ea
> _sx_xlock() at _sx_xlock+0x44
> mountcheckdirs() at mountcheckdirs+0x80
> vfs_donmount() at vfs_donmount+0xfbf
> kernel_mount() at kernel_mount+0xa1
> vfs_mountroot_try() at vfs_mountroot_try+0x177
> vfs_mountroot() at vfs_mountroot+0x47d
> start_init() at start_init+0x62
> fork_exit() at fork_exit+0x12a
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffff800001ad30, rbp = 0 ---
>
> The output of `df' and `mount' looks ok.

Yes, this is the "real" LOR as "filedesc" -> "ufs" in the poll() case should
be the normal order.  I believe this should fix it.
mountcheckdirs() doesn't
need the vnodes locked, it just needs the caller to hold references on them
so they aren't recycled:

--- //depot/projects/smpng/sys/kern/vfs_mount.c#96
+++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
@@ -1069,9 +1069,10 @@
 		vfs_event_signal(NULL, VQ_MOUNT, 0);
 		if (VFS_ROOT(mp, LK_EXCLUSIVE, &newdp))
 			panic("mount: lost mount");
+		VOP_UNLOCK(newdp, 0);
+		VOP_UNLOCK(vp, 0);
 		mountcheckdirs(vp, newdp);
-		vput(newdp);
-		VOP_UNLOCK(vp, 0);
+		vrele(newdp);
 		if ((mp->mnt_flag & MNT_RDONLY) == 0)
 			error = vfs_allocate_syncvnode(mp);
 		vfs_unbusy(mp);

-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 14:11:06 2009
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C36001065758; Wed, 29 Jul 2009 14:11:06 +0000 (UTC) (envelope-from serenity@exscape.org)
Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 433928FC1F; Wed, 29 Jul 2009 14:11:00 +0000 (UTC) (envelope-from serenity@exscape.org)
Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:43875 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MW9rX-0005o3-57; Wed, 29 Jul 2009 16:10:37 +0200
Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id C5DF9BD9A4; Wed, 29 Jul 2009 16:10:36 +0200 (CEST)
Message-Id: <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org>
From: Thomas Backman
To: Andriy Gapon
In-Reply-To: <4A7054E1.5060402@icyb.net.ua>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v935.3)
Date: Wed, 29 Jul 2009 16:10:34 +0200
References: <20090727072503.GA52309@jpru.ffm.jpru.de>
<4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MW9rX-0005o3-57. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MW9rX-0005o3-57 63e7baf9d7f618b067cd51275cee9dd5 Cc: freebsd-fs@FreeBSD.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 14:11:08 -0000 On Jul 29, 2009, at 15:55, Andriy Gapon wrote: > on 29/07/2009 16:52 Thomas Backman said the following: >>> >>> It would be useful to get line number in frame 13 and examine sx >>> object in frame >>> 11, esp. sx_lock field. >>> >>> -- >>> Andriy Gapon >> The "this" (above) was referring to variable values not being >> available >> in a vmcore. :) >> >> The zfs module appears to be built with symbols, and the symbols >> appear >> to be loaded in kgdb: >> >> Reading symbols from /boot/kernel/zfs.ko...Reading symbols from >> /bootdir/boot/kernel/zfs.ko.symbols...done. done. >> Loaded symbols for /boot/kernel/zfs.ko >> Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols >> from >> /bootdir/boot/kernel/opensolaris.ko.symbols...done. done. >> Loaded symbols for /boot/kernel/opensolaris.ko >> >> I didn't build the module(s) individually, either; in the previous >> cases, it was a clean buildworld/buildkernel (even with rm -rf >> /usr/obj/* beforehand), and in this case "just" a buildkernel (no >> manual >> cleaning, but no -DNO_CLEAN either). > > Got it. 
No idea unfortunately :(
> Could you still please examine sx in frame 11?

Maybe... ;) If this isn't right, just tell me how:

(kgdb) fr 11
#11 0xffffffff80342b99 in _sx_xlock_hard (sx=0xffffff00090d5018, tid=18446742974952890368, opts=Variable "opts" is not available.
) at /usr/src/sys/kern/kern_sx.c:575
575                             owner = (struct thread *)SX_OWNER(x);
(kgdb) list
570                      * chain lock. If so, drop the sleep queue lock and try
571                      * again.
572                      */
573                     if (!(x & SX_LOCK_SHARED) &&
574                         (sx->lock_object.lo_flags & SX_NOADAPTIVE) == 0) {
575                             owner = (struct thread *)SX_OWNER(x);
576                             if (TD_IS_RUNNING(owner)) {
577                                     sleepq_release(&sx->lock_object);
578                                     continue;
579                             }
(kgdb) p sx
$3 = (struct sx *) 0xffffff00090d5018
(kgdb) x/x sx
0xffffff00090d5018:     0xffffffff80b5634c
(kgdb) p *sx
$8 = {lock_object = {lo_name = 0xffffffff80b5634c "zp->z_lock", lo_flags = 40894464 [0x2700000, btw], lo_data = 0, lo_witness = 0x0}, sx_lock = 6}

... as you might notice, I'm mostly clueless as to what I'm doing here. :o
Hope that helps (a bit), though.

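For reference, sx_lock is a flag word: the low four bits are state flags and the rest is the owning thread pointer. A minimal sketch of decoding the value printed above, assuming the flag values from an 8.x-era sys/sx.h (copied by hand here, so treat them as an assumption and check them against your own tree). The helper names sx_owner_bits, sx_is_destroyed and sx_describe are made up for illustration and are not kernel functions:

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed values, hand-copied from an 8.x-era sys/sx.h; verify locally. */
#define SX_LOCK_SHARED            0x01
#define SX_LOCK_SHARED_WAITERS    0x02
#define SX_LOCK_EXCLUSIVE_WAITERS 0x04
#define SX_LOCK_RECURSED          0x08
#define SX_LOCK_FLAGMASK \
	(SX_LOCK_SHARED | SX_LOCK_SHARED_WAITERS | \
	 SX_LOCK_EXCLUSIVE_WAITERS | SX_LOCK_RECURSED)
#define SX_LOCK_DESTROYED \
	(SX_LOCK_SHARED_WAITERS | SX_LOCK_EXCLUSIVE_WAITERS)

/* Hypothetical helper: the lock word with the flag bits masked off. */
uintptr_t
sx_owner_bits(uintptr_t x)
{
	return (x & ~(uintptr_t)SX_LOCK_FLAGMASK);
}

/* Hypothetical helper: true if the word is exactly the destroyed pattern. */
int
sx_is_destroyed(uintptr_t x)
{
	return (x == SX_LOCK_DESTROYED);
}

/* Hypothetical helper: print a one-line decoding of the lock word. */
void
sx_describe(uintptr_t x)
{
	printf("sx_lock=%#lx owner=%#lx%s%s%s%s%s\n",
	    (unsigned long)x, (unsigned long)sx_owner_bits(x),
	    (x & SX_LOCK_SHARED) ? " SHARED" : "",
	    (x & SX_LOCK_SHARED_WAITERS) ? " SHARED_WAITERS" : "",
	    (x & SX_LOCK_EXCLUSIVE_WAITERS) ? " EXCLUSIVE_WAITERS" : "",
	    (x & SX_LOCK_RECURSED) ? " RECURSED" : "",
	    sx_is_destroyed(x) ? " => looks destroyed" : "");
}

#ifdef SX_DECODE_DEMO
int
main(void)
{
	sx_describe(6);		/* the value from "p *sx" above */
	return (0);
}
#endif
```

Compiled with -DSX_DECODE_DEMO this decodes the 6 from the dump. With that value the SX_LOCK_SHARED bit is clear and the owner bits come out as 0, so the TD_IS_RUNNING(owner) check in the listing would be handed a NULL thread pointer, which would be consistent with a page fault at that spot. That last step is only an inference from the listing, not something verified against the vmcore.
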
Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 14:36:05 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BCDD4106566B; Wed, 29 Jul 2009 14:36:05 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 927BA8FC23; Wed, 29 Jul 2009 14:36:04 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA04526; Wed, 29 Jul 2009 17:36:01 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4A705E50.8070307@icyb.net.ua> Date: Wed, 29 Jul 2009 17:36:00 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090630) MIME-Version: 1.0 To: Thomas Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> In-Reply-To: <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 14:36:06 -0000 on 29/07/2009 17:10 Thomas Backman said the following: [snip] > (kgdb) fr 11 [snip] > (kgdb) 
p *sx > $8 = {lock_object = {lo_name = 0xffffffff80b5634c "zp->z_lock", lo_flags > = 40894464 [0x2700000, btw], lo_data = 0, lo_witness = 0x0}, > sx_lock = 6} > > ... as you might notice, I'm mostly clueless as to what I'm doing here. :o > Hope that helps (a bit), though. Yes, it does and a lot. sx_lock = 6 means that this sx lock is destroyed: #define SX_LOCK_DESTROYED \ (SX_LOCK_SHARED_WAITERS | SX_LOCK_EXCLUSIVE_WAITERS) And lo_name tells that this is zp->z_lock. This lock is destroyed in zfs_znode_cache_destructor. Not enough knowledge for me to proceed further. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 15:20:23 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B9C54106564A; Wed, 29 Jul 2009 15:20:23 +0000 (UTC) (envelope-from r.c.ladan@gmail.com) Received: from mail-ew0-f206.google.com (mail-ew0-f206.google.com [209.85.219.206]) by mx1.freebsd.org (Postfix) with ESMTP id EBDCB8FC15; Wed, 29 Jul 2009 15:20:22 +0000 (UTC) (envelope-from r.c.ladan@gmail.com) Received: by ewy2 with SMTP id 2so34204ewy.43 for ; Wed, 29 Jul 2009 08:20:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=q6dnARTvc0U/3OneWaMzVTfqkaNjZusTVh2NHtsa064=; b=m54uWpDj7YiJ60WRJYs1evKvGt4qXRCCZt5Fr7zkoUsmTS3m0CU3N2pQplIq9RWwD9 gVMdCi/vTebtBU4qvVtkICy2g0nOLBUsU+gzt2dZdIBEg+lzkTkWU4+6LlkETE0xsLKN e+HjX07wqbe8MuFFUoEts+m7BeFKa+QYqIazU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=nMtkjvO+ZeYWJrNu+r1VSbuofBn1kN8iDx0LI+AHNZLqD8fSvbz8/TfecnvOYjSJHz 
 MW9SsFq8X4Ke/qqVpE1oPzq8njm8oCJXn8YGd1eVlkjdD/mb0M+BwkWmuWIt91VmZ2YO CbEwQ+VlCunNpl5umDpe9WLrjmO1zWaBUFEdE=
MIME-Version: 1.0
Sender: r.c.ladan@gmail.com
Received: by 10.216.93.13 with SMTP id k13mr1377101wef.75.1248880821972; Wed, 29 Jul 2009 08:20:21 -0700 (PDT)
In-Reply-To: <200907290742.20838.jhb@freebsd.org>
References: <200907271400.n6RE05Rv056472@freefall.freebsd.org> <200907281038.30277.jhb@freebsd.org> <200907290742.20838.jhb@freebsd.org>
Date: Wed, 29 Jul 2009 17:20:21 +0200
X-Google-Sender-Auth: 4f33c5949d6320da
Message-ID:
From: Rene Ladan
To: John Baldwin
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-fs@freebsd.org
Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 29 Jul 2009 15:20:24 -0000

2009/7/29 John Baldwin :
> On Wednesday 29 July 2009 5:52:24 am Rene Ladan wrote:
>> 2009/7/28 John Baldwin :
>> > On Tuesday 28 July 2009 10:03:40 am Rene Ladan wrote:
>> >> 2009/7/28 John Baldwin :
>> >> > On Monday 27 July 2009 10:00:05 am Rene Ladan wrote:
>> >> >> The following reply was made to PR kern/136945; it has been noted by GNATS.
>> >> >>
>> >> >> From: Rene Ladan
>> >> >> To: John Baldwin
>> >> >> Cc: bug-followup@freebsd.org
>> >> >> Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
>> >> >> Date: Mon, 27 Jul 2009 15:51:15 +0200
>> >> >>
>> >> >>  2009/7/27 John Baldwin :
>> >> >>  > I would actually expect this to be the correct order for these two locks.  Can
>> >> >>  > you capture the output of the 'debug.witness.fullgraph' sysctl to a file?
>> >> >>  >
>> >> >>  Yes, see attachment.  I'm still running the same 8.0-BETA2.
>> >> >
>> >> > Hmm, the attachment was eaten by a grue, can you post the file somewhere?
>> >> >
>> >> Yes, see ftp://rene-ladan.nl/pub/freebsd/kern_136945.txt
>> >
>> > Ok, it looks like it did encounter a UFS -> filedesc order at some point.  Can
>> > you patch sys/kern/subr_witness.c to add a section to the order_lists[] array
>> > after the 'ZFS locking list' and before the spin locks list that looks like
>> > this:
>> >
>> >        { "filedesc structure", &lock_class_sx },
>> >        { "ufs", &lock_class_lockmgr},
>> >        { NULL, NULL },
>> >
>> The LOR seems to be gone, previously it showed up only once right
>> after booting the system.
>>
>> But now a new LOR (according to the LOR page) seems to pop up:
>> Trying to mount root from ufs:/dev/ad0s1a
>> lock order reversal:
>>  1st 0xffffff0002a4ad80 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_vfsops.c:1465
>>  2nd 0xffffff0002b29a48 filedesc structure (filedesc structure) @
>> /usr/src/sys/kern/kern_descrip.c:2478
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>> _witness_debugger() at _witness_debugger+0x49
>> witness_checkorder() at witness_checkorder+0x7ea
>> _sx_xlock() at _sx_xlock+0x44
>> mountcheckdirs() at mountcheckdirs+0x80
>> vfs_donmount() at vfs_donmount+0xfbf
>> kernel_mount() at kernel_mount+0xa1
>> vfs_mountroot_try() at vfs_mountroot_try+0x177
>> vfs_mountroot() at vfs_mountroot+0x47d
>> start_init() at start_init+0x62
>> fork_exit() at fork_exit+0x12a
>> fork_trampoline() at fork_trampoline+0xe
>> --- trap 0, rip = 0, rsp = 0xffffff800001ad30, rbp = 0 ---
>>
>> The output of `df' and `mount' looks ok.
>
> Yes, this is the "real" LOR as "filedesc" -> "ufs" in the poll() case should
> be the normal order.  I believe this should fix it.
> mountcheckdirs() doesn't
> need the vnodes locked, it just needs the caller to hold references on them
> so they aren't recycled:
>
> --- //depot/projects/smpng/sys/kern/vfs_mount.c#96
> +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
> @@ -1069,9 +1069,10 @@
>                 vfs_event_signal(NULL, VQ_MOUNT, 0);
>                 if (VFS_ROOT(mp, LK_EXCLUSIVE, &newdp))
>                         panic("mount: lost mount");
> +               VOP_UNLOCK(newdp, 0);
> +               VOP_UNLOCK(vp, 0);
>                 mountcheckdirs(vp, newdp);
> -               vput(newdp);
> -               VOP_UNLOCK(vp, 0);
> +               vrele(newdp);
>                 if ((mp->mnt_flag & MNT_RDONLY) == 0)
>                         error = vfs_allocate_syncvnode(mp);
>                 vfs_unbusy(mp);
>
The LOR is still present, but at a different place without the
mountcheckdirs() call (not on the LOR page either) :

Trying to mount root from ufs:/dev/ad0s1a
lock order reversal:
 1st 0xffffff0002a4ad80 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2083
 2nd 0xffffff000233f048 filedesc structure (filedesc structure) @ /usr/src/sys/kern/vfs_mount.c:1485
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
_sx_xlock() at _sx_xlock+0x44
set_rootvnode() at set_rootvnode+0x57
vfs_mountroot_try() at vfs_mountroot_try+0x371
vfs_mountroot() at vfs_mountroot+0x47d
start_init() at start_init+0x62
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff800001ad30, rbp = 0 ---

René

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 15:35:24 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org
(mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 168BB10656C1; Wed, 29 Jul 2009 15:35:24 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id DB4448FC16; Wed, 29 Jul 2009 15:35:23 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 7F8E446B64; Wed, 29 Jul 2009 11:35:23 -0400 (EDT) Received: from jhbbsd.hudson-trading.com (unknown [209.249.190.8]) by bigwig.baldwin.cx (Postfix) with ESMTPA id AED868A0A2; Wed, 29 Jul 2009 11:35:22 -0400 (EDT) From: John Baldwin To: Rene Ladan Date: Wed, 29 Jul 2009 11:35:17 -0400 User-Agent: KMail/1.9.7 References: <200907271400.n6RE05Rv056472@freefall.freebsd.org> <200907290742.20838.jhb@freebsd.org> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Message-Id: <200907291135.17569.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Wed, 29 Jul 2009 11:35:22 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.5 required=4.2 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: freebsd-fs@freebsd.org Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 15:35:24 -0000 On Wednesday 29 July 2009 11:20:21 am Rene Ladan wrote: > 2009/7/29 John Baldwin : > > On Wednesday 29 July 2009 5:52:24 am Rene Ladan wrote: > >> 2009/7/28 John Baldwin : > >> > On 
Tuesday 28 July 2009 10:03:40 am Rene Ladan wrote:
> >> >> 2009/7/28 John Baldwin :
> >> >> > On Monday 27 July 2009 10:00:05 am Rene Ladan wrote:
> >> >> >> The following reply was made to PR kern/136945; it has been noted by GNATS.
> >> >> >>
> >> >> >> From: Rene Ladan
> >> >> >> To: John Baldwin
> >> >> >> Cc: bug-followup@freebsd.org
> >> >> >> Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
> >> >> >> Date: Mon, 27 Jul 2009 15:51:15 +0200
> >> >> >>
> >> >> >>  2009/7/27 John Baldwin :
> >> >> >>  > I would actually expect this to be the correct order for these two locks.  Can
> >> >> >>  > you capture the output of the 'debug.witness.fullgraph' sysctl to a file?
> >> >> >>  >
> >> >> >>  Yes, see attachment.  I'm still running the same 8.0-BETA2.
> >> >> >
> >> >> > Hmm, the attachment was eaten by a grue, can you post the file somewhere?
> >> >> >
> >> >> Yes, see ftp://rene-ladan.nl/pub/freebsd/kern_136945.txt
> >> >
> >> > Ok, it looks like it did encounter a UFS -> filedesc order at some point.  Can
> >> > you patch sys/kern/subr_witness.c to add a section to the order_lists[] array
> >> > after the 'ZFS locking list' and before the spin locks list that looks like
> >> > this:
> >> >
> >> >        { "filedesc structure", &lock_class_sx },
> >> >        { "ufs", &lock_class_lockmgr},
> >> >        { NULL, NULL },
> >> >
> >> The LOR seems to be gone, previously it showed up only once right
> >> after booting the system.
> >>
> >> But now a new LOR (according to the LOR page) seems pop up:
> >> Trying to mount root from ufs:/dev/ad0s1a
> >> lock order reversal:
> >>  1st 0xffffff0002a4ad80 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_vfsops.c:1465
> >>  2nd 0xffffff0002b29a48 filedesc structure (filedesc structure) @ /usr/src/sys/kern/kern_descrip.c:2478
> >> KDB: stack backtrace:
> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> >> _witness_debugger() at _witness_debugger+0x49
> >> witness_checkorder() at witness_checkorder+0x7ea
> >> _sx_xlock() at _sx_xlock+0x44
> >> mountcheckdirs() at mountcheckdirs+0x80
> >> vfs_donmount() at vfs_donmount+0xfbf
> >> kernel_mount() at kernel_mount+0xa1
> >> vfs_mountroot_try() at vfs_mountroot_try+0x177
> >> vfs_mountroot() at vfs_mountroot+0x47d
> >> start_init() at start_init+0x62
> >> fork_exit() at fork_exit+0x12a
> >> fork_trampoline() at fork_trampoline+0xe
> >> --- trap 0, rip = 0, rsp = 0xffffff800001ad30, rbp = 0 ---
> >>
> >> The output of `df' and `mount' looks ok.
> >
> > Yes, this is the "real" LOR as "filedesc" -> "ufs" in the poll() case should
> > be the normal order.  I believe this should fix it.
> > mountcheckdirs() doesn't
> > need the vnodes locked, it just needs the caller to hold references on them
> > so they aren't recycled:
> >
> > --- //depot/projects/smpng/sys/kern/vfs_mount.c#96
> > +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
> > @@ -1069,9 +1069,10 @@
> >                 vfs_event_signal(NULL, VQ_MOUNT, 0);
> >                 if (VFS_ROOT(mp, LK_EXCLUSIVE, &newdp))
> >                         panic("mount: lost mount");
> > +               VOP_UNLOCK(newdp, 0);
> > +               VOP_UNLOCK(vp, 0);
> >                 mountcheckdirs(vp, newdp);
> > -               vput(newdp);
> > -               VOP_UNLOCK(vp, 0);
> > +               vrele(newdp);
> >                 if ((mp->mnt_flag & MNT_RDONLY) == 0)
> >                         error = vfs_allocate_syncvnode(mp);
> >                 vfs_unbusy(mp);
> >
> The LOR is still present, but at a different place without the
> mountcheckdirs() call (not on the LOR page either) :

Ok, try this patch as well:

--- //depot/projects/smpng/sys/kern/vfs_mount.c#97
+++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
@@ -1481,6 +1481,8 @@
 	if (VFS_ROOT(TAILQ_FIRST(&mountlist), LK_EXCLUSIVE, &rootvnode))
 		panic("Cannot find root vnode");
 
+	VOP_UNLOCK(rootvnode, 0);
+
 	p = curthread->td_proc;
 	FILEDESC_XLOCK(p->p_fd);
 
@@ -1496,8 +1498,6 @@
 
 	FILEDESC_XUNLOCK(p->p_fd);
 
-	VOP_UNLOCK(rootvnode, 0);
-
 	EVENTHANDLER_INVOKE(mountroot);
 }
 
-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 16:11:21 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AEAED106570C; Wed, 29 Jul 2009 16:11:21 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from
ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 2CDD98FC14; Wed, 29 Jul 2009 16:11:21 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:42542 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWBkE-0000Fc-52; Wed, 29 Jul 2009 18:11:12 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 07EBB164111; Wed, 29 Jul 2009 18:11:08 +0200 (CEST) Message-Id: <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> From: Thomas Backman To: Andriy Gapon In-Reply-To: <4A70728C.7020004@freebsd.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Wed, 29 Jul 2009 18:11:05 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MWBkE-0000Fc-52. 
X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWBkE-0000Fc-52 11cadf99453aff79f729591724eaf8de Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 16:11:23 -0000 On Jul 29, 2009, at 18:02, Andriy Gapon wrote: > on 29/07/2009 17:36 Andriy Gapon said the following: >> on 29/07/2009 17:10 Thomas Backman said the following: >> [snip] >>> (kgdb) fr 11 >> [snip] >>> (kgdb) p *sx >>> $8 = {lock_object = {lo_name = 0xffffffff80b5634c "zp->z_lock", >>> lo_flags >>> = 40894464 [0x2700000, btw], lo_data = 0, lo_witness = 0x0}, >>> sx_lock = 6} >>> >>> ... as you might notice, I'm mostly clueless as to what I'm doing >>> here. :o >>> Hope that helps (a bit), though. >> >> Yes, it does and a lot. >> sx_lock = 6 means that this sx lock is destroyed: >> #define >> SX_LOCK_DESTROYED \ >> (SX_LOCK_SHARED_WAITERS | SX_LOCK_EXCLUSIVE_WAITERS) >> >> And lo_name tells that this is zp->z_lock. >> This lock is destroyed in zfs_znode_cache_destructor. >> Not enough knowledge for me to proceed further. > > So I guess that this is a case when zfs_znode_delete() was called on > znode that > was still referenced from some vnode. When the vnode gets reclaimed > we get this > problem. > Could you please examine vp in frame 15 or 16? > > -- > Andriy Gapon Sure. 
Lots of info in that one: (kgdb) fr 15 #15 0xffffffff803c839e in vgonel (vp=0xffffff0009252588) at vnode_if.h: 830 830 in vnode_if.h (kgdb) p *vp $3 = {v_type = VDIR, v_tag = 0xffffffff80b56347 "zfs", v_op = 0xffffffff80b5af00, v_data = 0xffffff00090d5000, v_mount = 0xffffff0002cd7bc0, v_nmntvnodes = {tqe_next = 0xffffff00090f5000, tqe_prev = 0xffffff0009252960}, v_un = {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0, vu_yield = 0}, v_hashlist = { le_next = 0x0, le_prev = 0x0}, v_hash = 0, v_cache_src = {lh_first = 0x0}, v_cache_dst = {tqh_first = 0x0, tqh_last = 0xffffff00092525e8}, v_cache_dd = 0x0, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_lock = { lock_object = {lo_name = 0xffffffff80b56347 "zfs", lo_flags = 91947008, lo_data = 0, lo_witness = 0x0}, lk_lock = 18446742974952890368, lk_timo = 51, lk_pri = 80}, v_interlock = {lock_object = { lo_name = 0xffffffff806126d9 "vnode interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, v_vnlock = 0xffffff0009252620, v_holdcnt = 1, v_usecount = 0, v_iflag = 128, v_vflag = 0, v_writecount = 0, v_freelist = {tqe_next = 0xffffff00090c3760, tqe_prev = 0xffffff002c0bfc18}, v_bufobj = { bo_mtx = {lock_object = {lo_name = 0xffffffff806126e9 "bufobj interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0xffffff00092526c0}, bv_root = 0x0, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xffffff00092526e0}, bv_root = 0x0, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_ops = 0xffffffff8079afa0, bo_bsize = 131072, bo_object = 0x0, bo_synclist = {le_next = 0x0, le_prev = 0x0}, bo_private = 0xffffff0009252588, __bo_vnode = 0xffffff0009252588}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0} Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 16:13:08 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by 
hub.freebsd.org (Postfix) with ESMTP id 576331065674 for ; Wed, 29 Jul 2009 16:13:08 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 8ED068FC08 for ; Wed, 29 Jul 2009 16:13:07 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id TAA07115; Wed, 29 Jul 2009 19:02:21 +0300 (EEST) (envelope-from avg@freebsd.org) Message-ID: <4A70728C.7020004@freebsd.org> Date: Wed, 29 Jul 2009 19:02:20 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090630) MIME-Version: 1.0 To: Thomas Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> In-Reply-To: <4A705E50.8070307@icyb.net.ua> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 16:13:09 -0000 on 29/07/2009 17:36 Andriy Gapon said the following: > on 29/07/2009 17:10 Thomas Backman said the following: > [snip] >> (kgdb) fr 11 > [snip] >> (kgdb) p *sx >> $8 = {lock_object = {lo_name = 0xffffffff80b5634c "zp->z_lock", lo_flags >> = 40894464 [0x2700000, btw], 
lo_data = 0, lo_witness = 0x0}, >> sx_lock = 6} >> >> ... as you might notice, I'm mostly clueless as to what I'm doing here. :o >> Hope that helps (a bit), though. > > Yes, it does and a lot. > sx_lock = 6 means that this sx lock is destroyed: > #define SX_LOCK_DESTROYED \ > (SX_LOCK_SHARED_WAITERS | SX_LOCK_EXCLUSIVE_WAITERS) > > And lo_name tells that this is zp->z_lock. > This lock is destroyed in zfs_znode_cache_destructor. > Not enough knowledge for me to proceed further. So I guess that this is a case when zfs_znode_delete() was called on znode that was still referenced from some vnode. When the vnode gets reclaimed we get this problem. Could you please examine vp in frame 15 or 16? -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 17:14:10 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 60D66106566B; Wed, 29 Jul 2009 17:14:10 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id D59418FC13; Wed, 29 Jul 2009 17:14:09 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:42834 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWCiU-0006Ie-4M; Wed, 29 Jul 2009 19:13:28 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id B58B21641C2; Wed, 29 Jul 2009 19:13:27 +0200 (CEST) Message-Id: <6C8097A7-1383-42C0-9A87-34C5065CA453@exscape.org> From: Thomas Backman To: Andriy Gapon In-Reply-To: <4A705E50.8070307@icyb.net.ua> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) 
Date: Wed, 29 Jul 2009 19:13:24 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MWCiU-0006Ie-4M. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWCiU-0006Ie-4M 02c6bc64cc9343b11c51a235d7b0c74b Cc: freebsd-fs@FreeBSD.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 17:14:10 -0000 On Jul 29, 2009, at 16:36, Andriy Gapon wrote: > on 29/07/2009 17:10 Thomas Backman said the following: > [snip] >> (kgdb) fr 11 > [snip] >> (kgdb) p *sx >> $8 = {lock_object = {lo_name = 0xffffffff80b5634c "zp->z_lock", >> lo_flags >> = 40894464 [0x2700000, btw], lo_data = 0, lo_witness = 0x0}, >> sx_lock = 6} >> >> ... as you might notice, I'm mostly clueless as to what I'm doing >> here. :o >> Hope that helps (a bit), though. > > Yes, it does and a lot. > sx_lock = 6 means that this sx lock is destroyed: > #define > SX_LOCK_DESTROYED \ > (SX_LOCK_SHARED_WAITERS | SX_LOCK_EXCLUSIVE_WAITERS) > > And lo_name tells that this is zp->z_lock. > This lock is destroyed in zfs_znode_cache_destructor. > Not enough knowledge for me to proceed further. Also, FWIW: Without "options SMP", "zpool" simply goes into an uninterruptible sleep (state D+) on export. 
kill -9 has no effect, and the backup process just hangs. The rest of the system works great, but... yeah. (The block that causes the panic is wrapped by #ifdef ADAPTIVE_SX, which isn't defined without smp AFAIK.) Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 17:18:17 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4AC031065805; Wed, 29 Jul 2009 17:18:17 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 1DAE18FC16; Wed, 29 Jul 2009 17:18:15 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id UAA08156; Wed, 29 Jul 2009 20:18:13 +0300 (EEST) (envelope-from avg@freebsd.org) Message-ID: <4A708455.5070304@freebsd.org> Date: Wed, 29 Jul 2009 20:18:13 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090630) MIME-Version: 1.0 To: Thomas Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> In-Reply-To: <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel 
mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 17:18:18 -0000 Thanks a lot again! Could you please try the following change? In sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, in function zfs_inactive() insert the following line: vrecycle(vp, curthread); before the following line: zfs_znode_free(zp); This is in "if (zp->z_dbuf == NULL)" branch. I hope that this should work in concert with the patch that Pawel has posted. P.S. Also Pawel has told me that adding 'CFLAGS+=-DDEBUG=1' to sys/modules/zfs/Makefile should enable additional debugging checks (ASSERTs) in ZFS code. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 18:06:17 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D1442106566B; Wed, 29 Jul 2009 18:06:17 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 477438FC0C; Wed, 29 Jul 2009 18:06:17 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:35325 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWDVg-0001Vc-3k; Wed, 29 Jul 2009 20:04:18 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 1E92A117D84; Wed, 29 Jul 2009 20:04:18 +0200 (CEST) Message-Id: <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> From: Thomas Backman To: Andriy Gapon In-Reply-To: <4A708455.5070304@freebsd.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes 
Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Wed, 29 Jul 2009 20:04:15 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MWDVg-0001Vc-3k. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWDVg-0001Vc-3k 90e01bd73b5a51a80d8601e56d6bb5b0 Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 18:06:18 -0000 On Jul 29, 2009, at 19:18, Andriy Gapon wrote: > > Thanks a lot again! > > Could you please try the following change? > In sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, in > function > zfs_inactive() insert the following line: > vrecycle(vp, curthread); > before the following line: > zfs_znode_free(zp); > > This is in "if (zp->z_dbuf == NULL)" branch. > > I hope that this should work in concert with the patch that Pawel > has posted. > > P.S. > Also Pawel has told me that adding 'CFLAGS+=-DDEBUG=1' to sys/ > modules/zfs/Makefile > should enable additional debugging checks (ASSERTs) in ZFS code. > > -- > Andriy Gapon Thanks for your work :) However, bad news: it didn't help. 
It *might* have gotten us further, though, because the DDB backtrace now looks like this:

_sx_xlock_hard()
_sx_xlock()
zfs_znode_free()
zfs_freebsd_inactive()
VOP_INACTIVE_APV()
vinactive()
vput()
dounmount()
unmount()
syscall()
XFast_syscall()

KGDB:

Unread portion of the kernel message buffer:
kernel trap 9 with interrupts disabled

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0xffffffff80342b99
stack pointer = 0x28:0xffffff803e9b7910
frame pointer = 0x28:0xffffff803e9b7970
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 1398 (zpool)
panic: from debugger
cpuid = 0
KDB: stack backtrace:
Uptime: 1m28s
Physical memory: 2030 MB
Dumping 1405 MB: ...

Reading symbols: ...

#9 0xffffffff805986aa in trap (frame=0xffffff803e9b7860) at /usr/src/sys/amd64/amd64/trap.c:639
#10 0xffffffff8057dfe7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#11 0xffffffff80342b99 in _sx_xlock_hard (sx=0xffffff0071019181, tid=18446742976093954048, opts=Variable "opts" is not available.
) at /usr/src/sys/kern/kern_sx.c:575
#12 0xffffffff8034350e in _sx_xlock (sx=Variable "sx" is not available.
) at sx.h:155
#13 0xffffffff80ad6be7 in zfs_znode_free () from /boot/kernel/zfs.ko
#14 0xffffffff80b5af20 in ?? ()
#15 0xffffff803e9b79f0 in ?? ()
#16 0xffffff0071032000 in ?? ()
#17 0xffffff803e9b79c0 in ?? ()
#18 0xffffffff80af719a in zfs_freebsd_inactive () from /boot/kernel/zfs.ko
#19 0xffffffff805c5b5a in VOP_INACTIVE_APV (vop=0xffffff0071101a48, a=0xffffff0071019181) at vnode_if.c:1863
#20 0xffffffff803c6aaa in vinactive (vp=0xffffff0071290938, td=0xffffff0071019001) at vnode_if.h:807
#21 0xffffffff803cbf26 in vput (vp=0xffffff0071290938) at /usr/src/sys/kern/vfs_subr.c:2257
#22 0xffffffff803c57ef in dounmount (mp=0xffffff0002b9e8d0, flags=0, td=Variable "td" is not available.
) at /usr/src/sys/kern/vfs_mount.c:1333 #23 0xffffffff803c5df8 in unmount (td=0xffffff0071032000, uap=0xffffff803e9b7bf0) at /usr/src/sys/kern/vfs_mount.c:1174 #24 0xffffffff805980bf in syscall (frame=0xffffff803e9b7c80) at /usr/ src/sys/amd64/amd64/trap.c:984 #25 0xffffffff8057e2c1 in Xfast_syscall () at /usr/src/sys/amd64/amd64/ exception.S:373 (kgdb) fr 22 #22 0xffffffff803c57ef in dounmount (mp=0xffffff0002b9e8d0, flags=0, td=Variable "td" is not available. ) at /usr/src/sys/kern/vfs_mount.c:1333 1333 vput(coveredvp); (kgdb) p *mp $1 = {mnt_mtx = {lock_object = {lo_name = 0xffffffff80611acd "struct mount mtx", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, mnt_gen = 2, mnt_list = {tqe_next = 0x0, tqe_prev = 0xffffff0002c71be8}, mnt_op = 0xffffffff80b5ae80, mnt_vfc = 0xffffffff80b5ae20, mnt_vnodecovered = 0xffffff0071290938, mnt_syncer = 0x0, mnt_ref = 0, mnt_nvnodelist = {tqh_first = 0x0, tqh_last = 0xffffff0002b9e930}, mnt_nvnodelistsize = 0, mnt_writeopcount = 0, mnt_kern_flag = 1627390088, mnt_flag = 4096, mnt_xflag = 0, mnt_noasync = 0, mnt_opt = 0xffffff0002f666f0, mnt_optnew = 0x0, mnt_maxsymlinklen = 0, mnt_stat = {f_version = 537068824, f_type = 4, f_flags = 4096, f_bsize = 131072, f_iosize = 131072, f_blocks = 486, f_bfree = 328, f_bavail = 328, f_files = 334, f_ffree = 328, f_syncwrites = 0, f_asyncwrites = 0, f_syncreads = 0, f_asyncreads = 0, f_spare = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, f_namemax = 255, f_owner = 0, f_fsid = {val = {1968303680, -171280380}}, f_charspare = '\0' , f_fstypename = "zfs", '\0' , f_mntfromname = "crashtestslave/test_orig", '\0' , f_mntonname = "/crashtestslave/crashtestslave/test_orig", '\0' }, mnt_cred = 0xffffff0002f0b700, mnt_data = 0xffffff002489e000, mnt_time = 0, mnt_iosize_max = 65536, mnt_export = 0x0, mnt_label = 0x0, mnt_hashseed = 1597825977, mnt_lockref = 0, mnt_secondary_writes = 0, mnt_secondary_accwrites = 0, mnt_susp_owner = 0x0, mnt_gjprovider = 0x0, mnt_explock = {lock_object = { 
lo_name = 0xffffffff80611ade "explock", lo_flags = 91422720, lo_data = 0, lo_witness = 0x0}, lk_lock = 1, lk_timo = 0, lk_pri = 80}} Worth noting above: it's NOT the "pool root FS" that's being unmounted here. The panic can also be triggered on "zfs unmount crashtestslave/ test_orig" (i.e. not the root FS which was the only that panicked with zfs unmount, as opposed to zpool export, before). (kgdb) fr 21 #21 0xffffffff803cbf26 in vput (vp=0xffffff0071290938) at /usr/src/sys/ kern/vfs_subr.c:2257 2257 vinactive(vp, td); (kgdb) p *vp $3 = {v_type = VBAD, v_tag = 0xffffffff80600ff6 "none", v_op = 0xffffffff80779700, v_data = 0x0, v_mount = 0x0, v_nmntvnodes = {tqe_next = 0x0, tqe_prev = 0xffffff0071290b38}, v_un = {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0, vu_yield = 0}, v_hashlist = {le_next = 0x0, le_prev = 0x0}, v_hash = 0, v_cache_src = {lh_first = 0x0}, v_cache_dst = {tqh_first = 0x0, tqh_last = 0xffffff0071290998}, v_cache_dd = 0x0, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_lock = {lock_object = {lo_name = 0xffffffff80b56367 "zfs", lo_flags = 91947008, lo_data = 0, lo_witness = 0x0}, lk_lock = 18446742976093954048, lk_timo = 51, lk_pri = 80}, v_interlock = {lock_object = {lo_name = 0xffffffff806126d9 "vnode interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, v_vnlock = 0xffffff00712909d0, v_holdcnt = 1, v_usecount = 0, v_iflag = 2176, v_vflag = 0, v_writecount = 0, v_freelist = {tqe_next = 0x0, tqe_prev = 0xffffff0002cecc18}, v_bufobj = {bo_mtx = {lock_object = {lo_name = 0xffffffff806126e9 "bufobj interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0xffffff0071290a70}, bv_root = 0x0, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xffffff0071290a90}, bv_root = 0x0, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_ops = 0xffffffff8079afa0, bo_bsize = 131072, bo_object = 0x0, bo_synclist = {le_next = 0x0, 
le_prev = 0x0}, bo_private = 0xffffff0071290938, __bo_vnode = 0xffffff0071290938}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0} (kgdb) fr 11 #11 0xffffffff80342b99 in _sx_xlock_hard (sx=0xffffff0071019181, tid=18446742976093954048, opts=Variable "opts" is not available. ) at /usr/src/sys/kern/kern_sx.c:575 575 owner = (struct thread *)SX_OWNER(x); (kgdb) p *sx $4 = {lock_object = {lo_name = 0xffffffff80b571
, lo_flags = 160000, lo_data = 0, lo_witness = 0x100000000000000}, sx_lock = 16717361816799281152} Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 20:15:14 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B0E941065672; Wed, 29 Jul 2009 20:15:14 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 2E3728FC18; Wed, 29 Jul 2009 20:15:14 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:51115 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWFYM-0000YR-3p; Wed, 29 Jul 2009 22:15:12 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 2B3ED4429A; Wed, 29 Jul 2009 22:15:10 +0200 (CEST) Message-Id: <16B40A2B-A1B5-4528-8721-6D352E7D5419@exscape.org> From: Thomas Backman To: Andriy Gapon In-Reply-To: <4A708455.5070304@freebsd.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Wed, 29 Jul 2009 22:15:06 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> 
<4A708455.5070304@freebsd.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MWFYM-0000YR-3p. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWFYM-0000YR-3p bc9334e9662bd9b399117d24cd624934 Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 20:15:15 -0000 On Jul 29, 2009, at 19:18, Andriy Gapon wrote: > > Thanks a lot again! > > Could you please try the following change? > In sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, in > function > zfs_inactive() insert the following line: > vrecycle(vp, curthread); > before the following line: > zfs_znode_free(zp); > > This is in "if (zp->z_dbuf == NULL)" branch. > > I hope that this should work in concert with the patch that Pawel > has posted. > > P.S. > Also Pawel has told me that adding 'CFLAGS+=-DDEBUG=1' to sys/ > modules/zfs/Makefile > should enable additional debugging checks (ASSERTs) in ZFS code. 
> > -- > Andriy Gapon Better backtraces: Without your vrecycle() addition, and with the -DDEBUG=1 one (note to self: core.txt.32): Unread portion of the kernel message buffer: panic: solaris assert: ((zp)->z_vnode) == ((void *)0), file: /usr/src/ sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/ zfs_znode.c, line: 1043 cpuid = 0 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a panic() at panic+0x182 zfs_znode_free() at zfs_znode_free+0xef zfs_freebsd_inactive() at zfs_freebsd_inactive+0x1a VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0x4a vinactive() at vinactive+0x6a vput() at vput+0x1c6 dounmount() at dounmount+0x4af unmount() at unmount+0x3c8 syscall() at syscall+0x28f Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (22, FreeBSD ELF64, unmount), rip = 0x80104e9ec, rsp = 0x7fffffffaa98, rbp = 0x801223300 --- KDB: enter: panic panic: from debugger cpuid = 0 Uptime: 1m5s Physical memory: 2034 MB Dumping 1405 MB: ... #11 0xffffffff8033a9cb in panic (fmt=Variable "fmt" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:558 #12 0xffffffff80aed21f in zfs_znode_free () from /boot/kernel/zfs.ko #13 0xffffffff80b10a9a in zfs_freebsd_inactive () from /boot/kernel/ zfs.ko #14 0xffffffff805c5b5a in VOP_INACTIVE_APV (vop=0xffffffff80b88220, a=0xffffff00401b9a48) at vnode_if.c:1863 #15 0xffffffff803c6aaa in vinactive (vp=0xffffff004038c3b0, td=0xffffff0040031000) at vnode_if.h:807 #16 0xffffffff803cbf26 in vput (vp=0xffffff004038c3b0) at /usr/src/sys/kern/vfs_subr.c:2257 #17 0xffffffff803c57ef in dounmount (mp=0xffffff0001cea8d0, flags=0, td=Variable "td" is not available. 
) at /usr/src/sys/kern/vfs_mount.c:1333 #18 0xffffffff803c5df8 in unmount (td=0xffffff0013adfab0, uap=0xffffff803ead0bf0) at /usr/src/sys/kern/vfs_mount.c:1174 #19 0xffffffff805980bf in syscall (frame=0xffffff803ead0c80) at /usr/src/sys/amd64/amd64/trap.c:984 #20 0xffffffff8057e2c1 in Xfast_syscall () at /usr/src/sys/amd64/ amd64/exception.S:373 #21 0x000000080104e9ec in ?? () Previous frame inner to this frame (corrupt stack?) --------------------------- WITH the vrecycle() and -DDEBUG=1: kernel trap 9 with interrupts disabled Fatal trap 9: general protection fault while in kernel mode cpuid = 0; apic id = 00 instruction pointer = 0x20:0xffffffff80342b99 stack pointer = 0x28:0xffffff803eaf8910 frame pointer = 0x28:0xffffff803eaf8970 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = resume, IOPL = 0 current process = 1414 (zpool) panic: from debugger cpuid = 0 KDB: stack backtrace: Uptime: 1m16s Physical memory: 2034 MB Dumping 1407 MB: ... #9 0xffffffff805986aa in trap (frame=0xffffff803eaf8860) at /usr/src/ sys/amd64/amd64/trap.c:639 #10 0xffffffff8057dfe7 in calltrap () at /usr/src/sys/amd64/amd64/ exception.S:224 #11 0xffffffff80342b99 in _sx_xlock_hard (sx=0xffffff0044136251, tid=18446742975340199936, opts=Variable "opts" is not available. ) at /usr/src/sys/kern/kern_sx.c:575 #12 0xffffffff8034350e in _sx_xlock (sx=Variable "sx" is not available. ) at sx.h:155 #13 0xffffffff80aed172 in zfs_znode_free () from /boot/kernel/zfs.ko #14 0xffffffff80b10a8a in zfs_freebsd_inactive () from /boot/kernel/ zfs.ko #15 0xffffffff805c5b5a in VOP_INACTIVE_APV (vop=0xffffff0044136251, a=0xffffff0015b9cd38) at vnode_if.c:1863 #16 0xffffffff803c6aaa in vinactive (vp=0xffffff00443dc588, td=0xffffff0044136001) at vnode_if.h:807 #17 0xffffffff803cbf26 in vput (vp=0xffffff00443dc588) at /usr/src/sys/ kern/vfs_subr.c:2257 #18 0xffffffff803c57ef in dounmount (mp=0xffffff0001cc38d0, flags=0, td=Variable "td" is not available. 
) at /usr/src/sys/kern/vfs_mount.c:1333 #19 0xffffffff803c5df8 in unmount (td=0xffffff004415c000, uap=0xffffff803eaf8bf0) at /usr/src/sys/kern/vfs_mount.c:1174 #20 0xffffffff805980bf in syscall (frame=0xffffff803eaf8c80) at /usr/ src/sys/amd64/amd64/trap.c:984 #21 0xffffffff8057e2c1 in Xfast_syscall () at /usr/src/sys/amd64/amd64/ exception.S:373 #22 0x000000080104e9ec in ?? () Previous frame inner to this frame (corrupt stack?) Time to sleep. I only have the kernel.debug for the latter panic, by the way, but at a quick glance they appear to be the same except for the panic line...? Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Wed Jul 29 21:17:47 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 057A4106566C; Wed, 29 Jul 2009 21:17:47 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id 3BA108FC1C; Wed, 29 Jul 2009 21:17:46 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 982C345C89; Wed, 29 Jul 2009 23:17:44 +0200 (CEST) Received: from localhost (chello087206049004.chello.pl [87.206.49.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 3276E45684; Wed, 29 Jul 2009 23:17:39 +0200 (CEST) Date: Wed, 29 Jul 2009 23:18:03 +0200 From: Pawel Jakub Dawidek To: Thomas Backman Message-ID: <20090729211803.GA2130@garage.freebsd.pl> References: <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> 
<16B40A2B-A1B5-4528-8721-6D352E7D5419@exscape.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="zYM0uCDKw75PZbzx" Content-Disposition: inline In-Reply-To: <16B40A2B-A1B5-4528-8721-6D352E7D5419@exscape.org> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 8.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org, FreeBSD current , Andriy Gapon Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2009 21:17:47 -0000 --zYM0uCDKw75PZbzx Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Jul 29, 2009 at 10:15:06PM +0200, Thomas Backman wrote: > On Jul 29, 2009, at 19:18, Andriy Gapon wrote: > > > > >Thanks a lot again! > > > >Could you please try the following change? > >In sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, in > >function > >zfs_inactive() insert the following line: > > vrecycle(vp, curthread); > >before the following line: > > zfs_znode_free(zp); > > > >This is in "if (zp->z_dbuf == NULL)" branch. > > > >I hope that this should work in concert with the patch that Pawel > >has posted. > > > >P.S. > >Also Pawel has told me that adding 'CFLAGS+=-DDEBUG=1' to sys/ > >modules/zfs/Makefile > >should enable additional debugging checks (ASSERTs) in ZFS code. 
> > > >-- > >Andriy Gapon > Better backtraces: > > Without your vrecycle() addition, and with the -DDEBUG=1 one (note to > self: core.txt.32): > > Unread portion of the kernel message buffer: > panic: solaris assert: ((zp)->z_vnode) == ((void *)0), file: /usr/src/ > sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/ > zfs_znode.c, line: 1043 Modify zfs_inactive() 'zp->z_dbuf == NULL' case to look like this: if (zp->z_dbuf == NULL) { /* * The fs has been unmounted, or we did a * suspend/resume and this file no longer exists. */ VI_LOCK(vp); vp->v_count = 0; /* count arrives as 1 */ vp->v_data = NULL; VI_UNLOCK(vp); rw_exit(&zfsvfs->z_teardown_inactive_lock); ZTOV(zp) = NULL; vrecycle(vp, curthread); zfs_znode_free(zp); return; } -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! --zYM0uCDKw75PZbzx Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFKcLyLForvXbEpPzQRAmHsAJ4gLjI2hH8yCsYy62NKANOywFmpbgCgotVG LG97BCENfOQuQ1Z72jkaMcQ= =orhi -----END PGP SIGNATURE----- --zYM0uCDKw75PZbzx-- From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 01:57:49 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3FDBC106564A; Thu, 30 Jul 2009 01:57:49 +0000 (UTC) (envelope-from mdounin@mdounin.ru) Received: from mdounin.cust.ramtel.ru (mdounin.cust.ramtel.ru [81.19.69.81]) by mx1.freebsd.org (Postfix) with ESMTP id F12758FC19; Thu, 30 Jul 2009 01:57:48 +0000 (UTC) (envelope-from mdounin@mdounin.ru) Received: from mdounin.ru (mdounin.cust.ramtel.ru [81.19.69.81]) by mdounin.cust.ramtel.ru (Postfix) with ESMTP id AD8E31700F; Thu, 30 Jul 2009 05:38:57 +0400 (MSD) Date: Thu, 30 Jul 2009 05:38:57 +0400 From: Maxim Dounin To: freebsd-current@freebsd.org, 
freebsd-fs@freebsd.org Message-ID: <20090730013857.GB8794@mdounin.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.19 (2009-01-05) Cc: Subject: another zfs panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 01:57:49 -0000 Hello! Here is zfs panic I'm able to reproduce by running an scp from remote machine to zfs volume and 3 parallel untars of ports tree in cycle. Not sure that everything is required, but the above workload triggers panic in several hours. This is on fresh current with GENERIC kernel: panic: sx_xlock() of destroyed sx @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:535 cpuid = 6 KDB: enter: panic [thread pid 36 tid 100071 ] Stopped at kdb_enter+0x3d: movq $0,0x68a040(%rip) db> bt Tracing pid 36 tid 100071 td 0xffffff00040f3720 kdb_enter() at kdb_enter+0x3d panic() at panic+0x17b _sx_xlock() at _sx_xlock+0xfc zfs_range_unlock() at zfs_range_unlock+0x38 zfs_get_data() at zfs_get_data+0xc1 zil_commit() at zil_commit+0x532 zfs_sync() at zfs_sync+0xa6 sync_fsync() at sync_fsync+0x13a sync_vnode() at sync_vnode+0x157 sched_sync() at sched_sync+0x1d1 fork_exit() at fork_exit+0x12a fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffff80e7ee3d30, rbp = 0 --- Machine is otherwise idle. The only zfs-related tuning applied is compression=gzip-9. Please let me know if you want me to test some patches. 
Maxim Dounin From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 02:38:09 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E4834106564A for ; Thu, 30 Jul 2009 02:38:09 +0000 (UTC) (envelope-from sfourman@gmail.com) Received: from mail-qy0-f191.google.com (mail-qy0-f191.google.com [209.85.221.191]) by mx1.freebsd.org (Postfix) with ESMTP id 675448FC18 for ; Thu, 30 Jul 2009 02:38:09 +0000 (UTC) (envelope-from sfourman@gmail.com) Received: by qyk29 with SMTP id 29so1658702qyk.3 for ; Wed, 29 Jul 2009 19:38:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=e5TOJ7Y6Le2eCRhDsguenDOsMfWz5edPR5W0MbVvxgo=; b=Y4TufWE8ZFMSE7biJ1mD4kqgZVow8ZvDikY5IVNE0MLm6KQk249NeIav8qnLecXFEI InAifyuhe1MqkYIhJ57CD67JX5+RcqSBXgsHAkY96ntl+fFhboGCfzSCXQicLVknLXsf hBKIdOLSumajh4AORCYNYNeusgrbwSY279xLc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=Z8Lh8zIljr7Aw6EpLjOBnNwGkzq7PnSQnQcYYwsnpYbFmpt9xTI0FmIRePPRgKEG4X VMnrHwW1r+mt1XfNwHFBPU9Q2QkGVN7fLkFzvF6hjSEAK+6xsMCz1H3/kimXA68nDVJW wmXzK50YRMdGKtwgMHYcmH+3k07dRQnLM6AlA= MIME-Version: 1.0 Received: by 10.229.74.77 with SMTP id t13mr121788qcj.7.1248920021526; Wed, 29 Jul 2009 19:13:41 -0700 (PDT) In-Reply-To: <20090730013857.GB8794@mdounin.ru> References: <20090730013857.GB8794@mdounin.ru> Date: Wed, 29 Jul 2009 21:13:41 -0500 Message-ID: <11167f520907291913i2718f784hf3d468284383eab1@mail.gmail.com> From: "Sam Fourman Jr." 
To: Maxim Dounin Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: another zfs panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 02:38:10 -0000 On Wed, Jul 29, 2009 at 8:38 PM, Maxim Dounin wrote: > Hello! > > Here is zfs panic I'm able to reproduce by running an scp from > remote machine to zfs volume and 3 parallel untars of ports tree > in cycle. Not sure that everything is required, but the above > workload triggers panic in several hours. > > This is on fresh current with GENERIC kernel: > > panic: sx_xlock() of destroyed sx @ > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:535 > cpuid = 6 > KDB: enter: panic > [thread pid 36 tid 100071 ] > Stopped at kdb_enter+0x3d: movq $0,0x68a040(%rip) > db> bt > Tracing pid 36 tid 100071 td 0xffffff00040f3720 > kdb_enter() at kdb_enter+0x3d > panic() at panic+0x17b > _sx_xlock() at _sx_xlock+0xfc > zfs_range_unlock() at zfs_range_unlock+0x38 > zfs_get_data() at zfs_get_data+0xc1 > zil_commit() at zil_commit+0x532 > zfs_sync() at zfs_sync+0xa6 > sync_fsync() at sync_fsync+0x13a > sync_vnode() at sync_vnode+0x157 > sched_sync() at sched_sync+0x1d1 > fork_exit() at fork_exit+0x12a > fork_trampoline() at fork_trampoline+0xe > --- trap 0, rip = 0, rsp = 0xffffff80e7ee3d30, rbp = 0 --- > > Machine is otherwise idle. The only zfs-related tuning applied is > compression=gzip-9. > > Please let me know if you want me to test some patches. 
> > Maxim Dounin > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" what is the output of uname -a what is the contents of /boot/loader.conf Sam Fourman Jr From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 02:51:49 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 72CBB106566B; Thu, 30 Jul 2009 02:51:49 +0000 (UTC) (envelope-from mdounin@mdounin.ru) Received: from mdounin.cust.ramtel.ru (mdounin.cust.ramtel.ru [81.19.69.81]) by mx1.freebsd.org (Postfix) with ESMTP id 2C7698FC16; Thu, 30 Jul 2009 02:51:49 +0000 (UTC) (envelope-from mdounin@mdounin.ru) Received: from mdounin.ru (mdounin.cust.ramtel.ru [81.19.69.81]) by mdounin.cust.ramtel.ru (Postfix) with ESMTP id 91B1B1701C; Thu, 30 Jul 2009 06:51:47 +0400 (MSD) Date: Thu, 30 Jul 2009 06:51:47 +0400 From: Maxim Dounin To: "Sam Fourman Jr." Message-ID: <20090730025147.GC8794@mdounin.ru> References: <20090730013857.GB8794@mdounin.ru> <11167f520907291913i2718f784hf3d468284383eab1@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <11167f520907291913i2718f784hf3d468284383eab1@mail.gmail.com> User-Agent: Mutt/1.5.19 (2009-01-05) Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: another zfs panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 02:51:49 -0000 Hello! On Wed, Jul 29, 2009 at 09:13:41PM -0500, Sam Fourman Jr. wrote: > On Wed, Jul 29, 2009 at 8:38 PM, Maxim Dounin wrote: > > Hello! 
> > > > Here is zfs panic I'm able to reproduce by running an scp from > > remote machine to zfs volume and 3 parallel untars of ports tree > > in cycle. Not sure that everything is required, but the above > > workload triggers panic in several hours. > > > > This is on fresh current with GENERIC kernel: > > > > panic: sx_xlock() of destroyed sx @ > > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:535 [...] > what is the output of uname -a > what is the contents of /boot/loader.conf $ uname -a FreeBSD x0040.mgmt.vega.ru 8.0-BETA2 FreeBSD 8.0-BETA2 #0: Wed Jul 29 12:46:06 UTC 2009 root@x0040.mgmt.vega.ru:/usr/obj/usr/src/sys/GENERIC amd64 $ cat /boot/loader.conf beastie_disable="YES" geom_mirror_load="YES" hint.uart.0.flags="0" hint.uart.1.flags="0x10" Maxim Dounin From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 05:39:02 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AE689106564A for ; Thu, 30 Jul 2009 05:39:02 +0000 (UTC) (envelope-from ludwigp@chip-web.com) Received: from toy2.chip-web.com (adsl-63-195-43-50.dsl.snfc21.pacbell.net [63.195.43.50]) by mx1.freebsd.org (Postfix) with SMTP id 580C58FC12 for ; Thu, 30 Jul 2009 05:39:02 +0000 (UTC) (envelope-from ludwigp@chip-web.com) Received: (qmail 62275 invoked from network); 30 Jul 2009 04:31:52 -0000 Received: from localhost.chip-web.com (HELO ?127.0.0.1?) 
(ludwigp@127.0.0.1) by localhost.chip-web.com with SMTP; 30 Jul 2009 04:31:52 -0000 Message-ID: <4A712290.9030308@chip-web.com> Date: Wed, 29 Jul 2009 21:33:20 -0700 From: Ludwig Pummer User-Agent: Thunderbird 2.0.0.22 (Windows/20090605) MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: ZFS raidz1 pool unavailable from losing 1 device X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 05:39:02 -0000 Hello, I found myself with a 4-drive raidz1 pool that was put into the UNAVAIL state ("insufficient replicas") when 3 drives showed ONLINE and 1 showed UNAVAIL. Can anyone suggest how I can get out of this pickle? Here's the full backstory: My system is 7.2-STABLE Jul 27, amd64, 4GB memory, just upgraded from 6.4-STABLE from last year. I just set up a ZFS raidz volume to replace a graid5 volume I had been using. I had it successfully set up using partitions across 4 disks, ad{6,8,10,12}s1e. Then I wanted to expand the raidz volume by merging the space from the adjacent disk partition. I thought I could just fail out the partition device in ZFS, edit the bsdlabel, and re-add the larger partition, ZFS would resilver, repeat until done. That's when I found out that ZFS doesn't let you fail out a device in a raidz volume. No big deal, I thought, I'll just go to single user mode and mess with the partition when ZFS isn't looking. When it comes back up it should notice that one of the devices is gone, I can do a 'zpool replace' and continue my plan. Well, after rebooting to single user mode, combining partitions ad12s1d and ad12s1e (removed the d partition), "zfs volinit", then "zpool status" just hung (Ctrl-C didn't kill it, so I rebooted). 
I thought this was a bit odd so I thought perhaps ZFS is confused by the ZFS metadata left on ad12s1e, so I blanked it out with "dd". That didn't help. I changed the name of the partition to ad12s1d thinking perhaps that would help. After that, "zfs volinit; zfs mount -a; zpool status" showed my raidz pool UNAVAIL with the message "insufficient replicas", ad{6,8,10}s1e ONLINE, and ad12s1e UNAVAIL "cannot open", and a more detailed message pointing me to http://www.sun.com/msg/ZFS-8000-3C. I tried doing a "zpool replace storage ad12s1e ad12s1d" but it refused, saying my zpool ("storage") was unavailable. Ditto for pretty much every zpool command I tried. "zpool clear" gave me a "permission denied" error. After some more searching of forums/mailing lists, I ran across one that suggested exporting & importing the zpool. I'm afraid that didn't fix my problem. The export worked, but now I cannot import the volume again ("cannot import 'storage': pool may be in use from other system", or with -f, "cannot import 'storage': one or more devices is currently unavailable"). Help! 
From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 05:49:44 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 781BE106564A; Thu, 30 Jul 2009 05:49:44 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id BBA1D8FC1B; Thu, 30 Jul 2009 05:49:43 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id C550145CD9; Thu, 30 Jul 2009 07:49:41 +0200 (CEST) Received: from localhost (chello087206049004.chello.pl [87.206.49.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 9BB5945C8A; Thu, 30 Jul 2009 07:49:36 +0200 (CEST) Date: Thu, 30 Jul 2009 07:50:01 +0200 From: Pawel Jakub Dawidek To: Maxim Dounin Message-ID: <20090730055001.GB2130@garage.freebsd.pl> References: <20090730013857.GB8794@mdounin.ru> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="p4qYPpj5QlsIQJ0K" Content-Disposition: inline In-Reply-To: <20090730013857.GB8794@mdounin.ru> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 8.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: another zfs panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 05:49:44 -0000 --p4qYPpj5QlsIQJ0K Content-Type: text/plain; charset=us-ascii 
Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Jul 30, 2009 at 05:38:57AM +0400, Maxim Dounin wrote: > Hello! > > Here is zfs panic I'm able to reproduce by running an scp from > remote machine to zfs volume and 3 parallel untars of ports tree > in cycle. Not sure that everything is required, but the above > workload triggers panic in several hours. > > This is on fresh current with GENERIC kernel: > > panic: sx_xlock() of destroyed sx @ > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:535 > cpuid = 6 > KDB: enter: panic > [thread pid 36 tid 100071 ] > Stopped at kdb_enter+0x3d: movq $0,0x68a040(%rip) > db> bt > Tracing pid 36 tid 100071 td 0xffffff00040f3720 > kdb_enter() at kdb_enter+0x3d > panic() at panic+0x17b > _sx_xlock() at _sx_xlock+0xfc > zfs_range_unlock() at zfs_range_unlock+0x38 > zfs_get_data() at zfs_get_data+0xc1 > zil_commit() at zil_commit+0x532 > zfs_sync() at zfs_sync+0xa6 > sync_fsync() at sync_fsync+0x13a > sync_vnode() at sync_vnode+0x157 > sched_sync() at sched_sync+0x1d1 > fork_exit() at fork_exit+0x12a > fork_trampoline() at fork_trampoline+0xe > --- trap 0, rip = 0, rsp = 0xffffff80e7ee3d30, rbp = 0 --- > > Machine is otherwise idle. The only zfs-related tuning applied is > compression=gzip-9. > > Please let me know if you want me to test some patches. The kernel syncer tries to sync vnode which has its znode already destroyed. There is one place (that we know of) where vrecycle() is missing. Could you try this patch: http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! 
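A minimal userland sketch of the failure mode described in the reply above, assuming a heavily simplified model: every type and function name here (model_lock, model_znode, model_vnode, model_syncer_visit and so on) is hypothetical and is not the kernel's API. It only illustrates why a vnode that is still reachable by the syncer after its znode has been torn down ends up touching a destroyed lock, and why detaching (recycling) the vnode first avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the real kernel types. */
struct model_lock {
	bool destroyed;			/* set once the lock has been torn down */
};

struct model_znode {
	struct model_lock range_lock;	/* stands in for the znode's range lock */
};

struct model_vnode {
	struct model_znode *v_data;	/* back pointer to fs-private data */
	bool on_sync_list;		/* still reachable by the syncer? */
};

enum sync_result { SYNC_OK, SYNC_SKIPPED, SYNC_DESTROYED_LOCK };

/* Tear down the znode: its lock becomes unusable from here on. */
static void
model_znode_destroy(struct model_znode *zp)
{
	zp->range_lock.destroyed = true;
}

/* Recycle the vnode: detach it from the znode and from the syncer. */
static void
model_vnode_recycle(struct model_vnode *vp)
{
	vp->v_data = NULL;
	vp->on_sync_list = false;
}

/* What the syncer would observe when it reaches this vnode. */
static enum sync_result
model_syncer_visit(const struct model_vnode *vp)
{
	if (!vp->on_sync_list || vp->v_data == NULL)
		return (SYNC_SKIPPED);
	if (vp->v_data->range_lock.destroyed)
		return (SYNC_DESTROYED_LOCK);	/* the kernel would panic here */
	return (SYNC_OK);
}

/* Run the teardown with or without recycling first, then let the syncer look. */
static enum sync_result
model_teardown_then_sync(bool recycle_first)
{
	struct model_znode zn = { .range_lock = { .destroyed = false } };
	struct model_vnode vn = { .v_data = &zn, .on_sync_list = true };

	if (recycle_first)
		model_vnode_recycle(&vn);
	model_znode_destroy(&zn);
	return (model_syncer_visit(&vn));
}
```

Calling model_teardown_then_sync(false) models the missing vrecycle() case and reports the destroyed lock; model_teardown_then_sync(true) models the patched path, where the syncer simply skips the vnode.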
--p4qYPpj5QlsIQJ0K Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFKcTSJForvXbEpPzQRAnA8AKCW8RInnvuPRaqbzWtUW6d/h121XgCfdgK1 ltQcddAqHrtc3JaVmnyjIlQ= =hpX/ -----END PGP SIGNATURE----- --p4qYPpj5QlsIQJ0K-- From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 07:05:01 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 215DA1065688; Thu, 30 Jul 2009 07:05:01 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 8334C8FC15; Thu, 30 Jul 2009 07:05:00 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:33499 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWPgt-0007jt-4x; Thu, 30 Jul 2009 09:04:41 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 11E86BD868; Thu, 30 Jul 2009 09:04:40 +0200 (CEST) Message-Id: <4FD5D430-9847-4333-AF47-00DE735E0E25@exscape.org> From: Thomas Backman To: Pawel Jakub Dawidek In-Reply-To: <20090729211803.GA2130@garage.freebsd.pl> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Thu, 30 Jul 2009 09:04:38 +0200 References: <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> 
<16B40A2B-A1B5-4528-8721-6D352E7D5419@exscape.org> <20090729211803.GA2130@garage.freebsd.pl> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MWPgt-0007jt-4x. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWPgt-0007jt-4x 7001baa20cfa4ad814278fba78822f71 Cc: freebsd-fs@freebsd.org, FreeBSD current , Andriy Gapon Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 07:05:02 -0000 On Jul 29, 2009, at 23:18, Pawel Jakub Dawidek wrote: > On Wed, Jul 29, 2009 at 10:15:06PM +0200, Thomas Backman wrote: >> On Jul 29, 2009, at 19:18, Andriy Gapon wrote: >> >>> >>> Thanks a lot again! >>> >>> Could you please try the following change? >>> In sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, in >>> function >>> zfs_inactive() insert the following line: >>> vrecycle(vp, curthread); >>> before the following line: >>> zfs_znode_free(zp); >>> >>> This is in "if (zp->z_dbuf == NULL)" branch. >>> >>> I hope that this should work in concert with the patch that Pawel >>> has posted. >>> >>> P.S. >>> Also Pawel has told me that adding 'CFLAGS+=-DDEBUG=1' to sys/ >>> modules/zfs/Makefile >>> should enable additional debugging checks (ASSERTs) in ZFS code. 
>>> >>> -- >>> Andriy Gapon >> Better backtraces: >> >> Without your vrecycle() addition, and with the -DDEBUG=1 one (note to >> self: core.txt.32): >> >> Unread portion of the kernel message buffer: >> panic: solaris assert: ((zp)->z_vnode) == ((void *)0), file: /usr/ >> src/ >> sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/ >> zfs_znode.c, line: 1043 > > Modify zfs_inactive() 'zp->z_dbuf == NULL' case to look like this: > > if (zp->z_dbuf == NULL) { > /* > * The fs has been unmounted, or we did a > * suspend/resume and this file no longer exists. > */ > VI_LOCK(vp); > vp->v_count = 0; /* count arrives as 1 */ > vp->v_data = NULL; > VI_UNLOCK(vp); > rw_exit(&zfsvfs->z_teardown_inactive_lock); > ZTOV(zp) = NULL; > vrecycle(vp, curthread); > zfs_znode_free(zp); > return; > } New code, new panic. :( Same place as before, on exporting. panic: solaris assert: zp != ((void *)0), file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, line: 4357 cpuid = 0 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a panic() at panic+0x182 zfs_freebsd_reclaim() at zfs_freebsd_reclaim+0x1f2 VOP_RECLAIM_APV() at VOP_RECLAIM_APV+0x4a vgonel() at vgonel+0x12e vrecycle() at vrecycle+0x7d zfs_inactive() at zfs_inactive+0x1aa zfs_freebsd_inactive() at zfs_freebsd_inactive+0x1a VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0x4a vinactive() at vinactive+0x6a vput() at vput+0x1c6 dounmount() at dounmount+0x4af unmount() at unmount+0x3c8 syscall() at syscall+0x28f Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (22, FreeBSD ELF64, unmount), rip = 0x80104e9ec, rsp = 0x7fffffffaa98, rbp = 0x801223300 --- KDB: enter: panic [lockedvnods] 0xffffff000bf8f3b0: tag zfs, type VDIR usecount 0, writecount 0, refcount 1 mountedhere 0 flags (VI_DOOMED|VI_DOINGINACT) lock type zfs: EXCL by thread 0xffffff00450b0390 (pid 1400) panic: from debugger cpuid = 0 Uptime: 1m34s Physical memory: 2030 MB Dumping 1407 MB: ... 
#11 0xffffffff8033a9cb in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:558
#12 0xffffffff80b110c2 in zfs_freebsd_reclaim () from /boot/kernel/zfs.ko
#13 0xffffffff805c5c2a in VOP_RECLAIM_APV (vop=0x0, a=0xffffff803e9578f0) at vnode_if.c:1926
#14 0xffffffff803c839e in vgonel (vp=0xffffff000bf8f3b0) at vnode_if.h:830
#15 0xffffffff803ca7ad in vrecycle (vp=0xffffff000bf8f3b0, td=Variable "td" is not available.
) at /usr/src/sys/kern/vfs_subr.c:2504
#16 0xffffffff80b109ea in zfs_inactive () from /boot/kernel/zfs.ko
#17 0xffffffff80b88220 in ?? ()
#18 0xffffff803e9579f0 in ?? ()
#19 0xffffff00450b0390 in ?? ()
#20 0x0000000000000000 in ?? ()
#21 0xffffff803e957a40 in ?? ()
#22 0xffffff803e9579c0 in ?? ()
#23 0xffffffff80b10a9a in zfs_freebsd_inactive () from /boot/kernel/zfs.ko
#24 0xffffffff805c5b5a in VOP_INACTIVE_APV (vop=0xffffff000bf8f470, a=0xffffff0045146a48) at vnode_if.c:1863
#25 0xffffffff803c6aaa in vinactive (vp=0xffffff000bf8f3b0, td=0xffffff000bf8f3b0) at vnode_if.h:807
#26 0xffffffff803cbf26 in vput (vp=0xffffff000bf8f3b0) at /usr/src/sys/kern/vfs_subr.c:2257
#27 0xffffffff803c57ef in dounmount (mp=0xffffff0002d0e8d0, flags=0, td=Variable "td" is not available.
)
#28 0xffffffff803c5df8 in unmount (td=0xffffff00450b0390, uap=0xffffff803e957bf0) at /usr/src/sys/kern/vfs_mount.c:1174
#29 0xffffffff805980bf in syscall (frame=0xffffff803e957c80) at /usr/src/sys/amd64/amd64/trap.c:984
#30 0xffffffff8057e2c1 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:373
#31 0x000000080104e9ec in ?? ()
Previous frame inner to this frame (corrupt stack?)

BTW, here's my svn diff output (in /usr/src; one irrelevant patch not
shown; I used your previous zfs_vnops.2.c patch and then replaced the
if block as above):

Index: sys/modules/zfs/Makefile
===================================================================
--- sys/modules/zfs/Makefile (revision 195910)
+++ sys/modules/zfs/Makefile (working copy)
@@ -97,3 +97,4 @@
 CWARNFLAGS+=-Wno-inline
 CWARNFLAGS+=-Wno-switch
 CWARNFLAGS+=-Wno-pointer-arith
+CFLAGS+=-DDEBUG=1
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
===================================================================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c (revision 195910)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c (working copy)
@@ -3709,11 +3709,13 @@
                  * The fs has been unmounted, or we did a
                  * suspend/resume and this file no longer exists.
                  */
-                mutex_enter(&zp->z_lock);
                 VI_LOCK(vp);
                 vp->v_count = 0; /* count arrives as 1 */
-                mutex_exit(&zp->z_lock);
+                vp->v_data = NULL;
+                VI_UNLOCK(vp);
                 rw_exit(&zfsvfs->z_teardown_inactive_lock);
+                ZTOV(zp) = NULL;
+                vrecycle(vp, curthread);
                 zfs_znode_free(zp);
                 return;
         }
@@ -4351,7 +4353,6 @@
 {
         vnode_t *vp = ap->a_vp;
         znode_t *zp = VTOZ(vp);
-        zfsvfs_t *zfsvfs;

         ASSERT(zp != NULL);

@@ -4361,13 +4362,18 @@
         vnode_destroy_vobject(vp);

         mutex_enter(&zp->z_lock);
-        ASSERT(zp->z_phys);
+        ASSERT(zp->z_phys != NULL);
         ZTOV(zp) = NULL;
-        if (!zp->z_unlinked) {
+        mutex_exit(&zp->z_lock);
+
+        if (zp->z_unlinked)
+                ; /* Do nothing. */
+        else if (zp->z_dbuf == NULL)
+                zfs_znode_free(zp);
+        else /* if (!zp->z_unlinked && zp->z_dbuf != NULL) */ {
+                zfsvfs_t *zfsvfs = zp->z_zfsvfs;
                 int locked;

-                zfsvfs = zp->z_zfsvfs;
-                mutex_exit(&zp->z_lock);
                 locked = MUTEX_HELD(ZFS_OBJ_MUTEX(zfsvfs, zp->z_id)) ? 2 :
                     ZFS_OBJ_HOLD_TRYENTER(zfsvfs, zp->z_id);
                 if (locked == 0) {
@@ -4383,8 +4389,6 @@
                                 ZFS_OBJ_HOLD_EXIT(zfsvfs, zp->z_id);
                         zfs_znode_free(zp);
                 }
-        } else {
-                mutex_exit(&zp->z_lock);
         }
         VI_LOCK(vp);
         vp->v_data = NULL;

Should I revert to the svn state and then change the if clause as
above, or is this correct?

Regards,
Thomas

From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 07:21:54 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C05A5106564A for ; Thu, 30 Jul 2009 07:21:54 +0000 (UTC) (envelope-from numisemis@yahoo.com) Received: from web37301.mail.mud.yahoo.com (web37301.mail.mud.yahoo.com [209.191.90.244]) by mx1.freebsd.org (Postfix) with SMTP id 7EE8C8FC18 for ; Thu, 30 Jul 2009 07:21:54 +0000 (UTC) (envelope-from numisemis@yahoo.com) Received: (qmail 11604 invoked by uid 60001); 30 Jul 2009 07:21:54 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1248938514; bh=E9DPUTM9Zkl0fHuZYPZwl9fSGGAMuwpZUFkrvLLLUG0=; h=Message-ID:X-YMail-OSG:Received:X-Mailer:References:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type; b=H54ms7JyUuVBL19eCWocPIGOdIfXMTQCSRoC8z+JevJIp+01tzyjDDz4avTNx9o0IA2MPxZmXqINa15BP3oV/LDfJrVueJ6Wj+g1YcPSjup0yPwU+973VFgFpLQmXrC1tfGtdiWv3KwuFcQKGMCSD9sA6XHCI30yJFDITXRXxCo= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Message-ID:X-YMail-OSG:Received:X-Mailer:References:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type; b=KA7fkJXTnhoQJkVdgNEmECCJdJkcN72d+QL6YUAbF4m9RuFqDdp8vkeQ/7HijD00bPcApltrQ5bITTUQ+Ye33QrSSzDU5t923QwU8prBj/+P+aD0ClwltQgSRA4k3Ubo+D+0MKfDZL31zQR/fE9th2AT04AQ7yRaEgWgFptd47I=; Message-ID: <46899.11156.qm@web37301.mail.mud.yahoo.com> X-YMail-OSG:
EUGMqmQVM1kn2PW9AyTlKASSsDKvCHDToeYPOmr8E0P2jqW2lhCzQQvYTG4kiwDlRy4vy3TbRX3xLv6guCxoqXJzTUkdZtBheQqO2.YfhDnQUd2bS5tF0yKVvIzKPBf0VVlJyA8ckY_8qFggz9kIox3_R37aOijz7ROOBje37s7z8uemMy2wrZW8PmU8wKaP8NLWI1aJGDLnZ1Mj1EOWe6m0Xv3FeJCYuXjsOmNOw9VmzewNJAWJduT.UQrlJw2YUnAjEkWZX6pVGB8DgQWRu8WTutHQgOpviK0113TiS65HOu7Ru.sJRmdAnyzmckOjPZXscujfm.xRRAv4eJjWZLhy Received: from [213.147.110.159] by web37301.mail.mud.yahoo.com via HTTP; Thu, 30 Jul 2009 00:21:53 PDT X-Mailer: YahooMailRC/1358.22 YahooMailWebService/0.7.289.15 References: <4A712290.9030308@chip-web.com> Date: Thu, 30 Jul 2009 00:21:53 -0700 (PDT) From: Simun Mikecin To: Ludwig Pummer In-Reply-To: <4A712290.9030308@chip-web.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: freebsd-fs@freebsd.org Subject: Re: ZFS raidz1 pool unavailable from losing 1 device X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 07:21:55 -0000 Ludwin Pummer wrote: > My system is 7.2-STABLE Jul 27, amd64, 4GB memory, just upgraded from 6.4-STABLE > from last year. I just set up a ZFS raidz volume to replace a graid5 volume I > had been using. I had it successfully set up using partitions across 4 disks, > ad{6,8,10,12}s1e. Then I wanted to expand the raidz volume by merging the space > from the adjacent disk partition. I thought I could just fail out the partition > device in ZFS, edit the bsdlabel, and re-add the larger partition, ZFS would > resilver, repeat until done. That's when I found out that ZFS doesn't let you > fail out a device in a raidz volume. No big deal, I thought, I'll just go to > single user mode and mess with the partition when ZFS isn't looking. When it > comes back up it should notice that one of the device is gone, I can do a 'zfs > replace' and continue my plan. 
> > Well, after rebooting to single user mode, combining partitions ad12s1d and > ad12s1e (removed the d partiton), "zfs volinit", then "zpool status" just hung > (Ctrl-C didn't kill it, so I rebooted). I thought this was a bit odd so I > thought perhaps ZFS is confused by the ZFS metadata left on ad12s1e, so I > blanked it out with "dd". That didn't help. I changed the name of the partition > to ad12s1d thinking perhaps that would help. After that, "zfs volinit; zfs mount > -a; zpool status" showed my raidz pool UNAVAIL with the message "insufficient > replicas", ad{6,8,10}s1e ONLINE, and ad12s1e UNAVAIL "cannot open", and a more > detailed message pointing me to http://www.sun.com/msg/ZFS-8000-3C. I tried > doing a "zpool replace storage ad12s1e ad12s1d" but it refused, saying my zpool > ("storage") was unavailable. Ditto for pretty much every zpool command I tried. > "zpool clear" gave me a "permission denied" error. Was your pool imported while you were repartitioning in single user mode? From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 07:34:36 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E15671065673 for ; Thu, 30 Jul 2009 07:34:36 +0000 (UTC) (envelope-from ludwigp@chip-web.com) Received: from toy2.chip-web.com (adsl-63-195-43-50.dsl.snfc21.pacbell.net [63.195.43.50]) by mx1.freebsd.org (Postfix) with SMTP id 8DA0E8FC1E for ; Thu, 30 Jul 2009 07:34:36 +0000 (UTC) (envelope-from ludwigp@chip-web.com) Received: (qmail 99641 invoked from network); 30 Jul 2009 07:24:27 -0000 Received: from localhost.chip-web.com (HELO ?127.0.0.1?) 
(ludwigp@127.0.0.1) by localhost.chip-web.com with SMTP; 30 Jul 2009 07:24:27 -0000 Message-ID: <4A714B03.6050704@chip-web.com> Date: Thu, 30 Jul 2009 00:25:55 -0700 From: Ludwig Pummer User-Agent: Thunderbird 2.0.0.22 (Windows/20090605) MIME-Version: 1.0 To: Simun Mikecin References: <4A712290.9030308@chip-web.com> <46899.11156.qm@web37301.mail.mud.yahoo.com> In-Reply-To: <46899.11156.qm@web37301.mail.mud.yahoo.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: ZFS raidz1 pool unavailable from losing 1 device X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 07:34:37 -0000 Simun Mikecin wrote: > Ludwin Pummer wrote: > > >> My system is 7.2-STABLE Jul 27, amd64, 4GB memory, just upgraded from 6.4-STABLE >> from last year. I just set up a ZFS raidz volume to replace a graid5 volume I >> had been using. I had it successfully set up using partitions across 4 disks, >> ad{6,8,10,12}s1e. Then I wanted to expand the raidz volume by merging the space >> from the adjacent disk partition. I thought I could just fail out the partition >> device in ZFS, edit the bsdlabel, and re-add the larger partition, ZFS would >> resilver, repeat until done. That's when I found out that ZFS doesn't let you >> fail out a device in a raidz volume. No big deal, I thought, I'll just go to >> single user mode and mess with the partition when ZFS isn't looking. When it >> comes back up it should notice that one of the device is gone, I can do a 'zfs >> replace' and continue my plan. >> >> Well, after rebooting to single user mode, combining partitions ad12s1d and >> ad12s1e (removed the d partiton), "zfs volinit", then "zpool status" just hung >> (Ctrl-C didn't kill it, so I rebooted). 
I thought this was a bit odd so I >> thought perhaps ZFS is confused by the ZFS metadata left on ad12s1e, so I >> blanked it out with "dd". That didn't help. I changed the name of the partition >> to ad12s1d thinking perhaps that would help. After that, "zfs volinit; zfs mount >> -a; zpool status" showed my raidz pool UNAVAIL with the message "insufficient >> replicas", ad{6,8,10}s1e ONLINE, and ad12s1e UNAVAIL "cannot open", and a more >> detailed message pointing me to http://www.sun.com/msg/ZFS-8000-3C. I tried >> doing a "zpool replace storage ad12s1e ad12s1d" but it refused, saying my zpool >> ("storage") was unavailable. Ditto for pretty much every zpool command I tried. >> "zpool clear" gave me a "permission denied" error. >> > > Was your pool imported while you were repartitioning in single user mode? > > Yes, I guess you could say it was. ZFS wasn't loaded while I was doing the repartitioning, though. --Ludwig From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 09:05:35 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4DFBA106566B; Thu, 30 Jul 2009 09:05:35 +0000 (UTC) (envelope-from r.c.ladan@gmail.com) Received: from mail-ew0-f206.google.com (mail-ew0-f206.google.com [209.85.219.206]) by mx1.freebsd.org (Postfix) with ESMTP id 7008C8FC1C; Thu, 30 Jul 2009 09:05:33 +0000 (UTC) (envelope-from r.c.ladan@gmail.com) Received: by ewy2 with SMTP id 2so552419ewy.43 for ; Thu, 30 Jul 2009 02:05:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=9k17aCIyvHGz1ioyJyPM7Lb8RxDwi/SgxqV1EGBT+Ao=; b=KLaEsbuSvvTZGCvXLb9zcPN4Vurlz43aa6ftTBcQ8hdemez5r4Kf3/Mrz64RjN/AWJ jNS/8AD6tF9dZnhjnmHwM2ktb4sxNkqo5u26kGQdS2EDrwB41BxOSfUsRemRKXgudZH2 
9xFNpFWRxKsILCBpp+3CqXfhiDRIFlVq2rYbg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=I3DJSgjfgPlKhYXneazQHVmwg5AKEQXbIcxm0qO0Oa/PwuUTnayRdEVvHKldrQ3B0S u9OzI6rRnUngfTiHKQ2+CezVLpvnv0U4cgcS3mJQzmbwlmCa6xdPqeczEPeff/XXPCuf HTP7soPS7prYKtWsPnteaR9Wl7IUyGMUQx5RY= MIME-Version: 1.0 Sender: r.c.ladan@gmail.com Received: by 10.216.20.210 with SMTP id p60mr186046wep.172.1248944732741; Thu, 30 Jul 2009 02:05:32 -0700 (PDT) In-Reply-To: <200907291135.17569.jhb@freebsd.org> References: <200907271400.n6RE05Rv056472@freefall.freebsd.org> <200907290742.20838.jhb@freebsd.org> <200907291135.17569.jhb@freebsd.org> Date: Thu, 30 Jul 2009 11:05:32 +0200 X-Google-Sender-Auth: 7880b114fe81c463 Message-ID: From: Rene Ladan To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 09:05:35 -0000

2009/7/29 John Baldwin :
> On Wednesday 29 July 2009 11:20:21 am Rene Ladan wrote:
>> 2009/7/29 John Baldwin :
>> > On Wednesday 29 July 2009 5:52:24 am Rene Ladan wrote:
>> >> 2009/7/28 John Baldwin :
>> >> > On Tuesday 28 July 2009 10:03:40 am Rene Ladan wrote:
>> >> >> 2009/7/28 John Baldwin :
>> >> >> > On Monday 27 July 2009 10:00:05 am Rene Ladan wrote:
>> >> >> >> The following reply was made to PR kern/136945; it has been noted
> by
>> >> > GNATS.
>> >> >> >>
>> >> >> >> From: Rene Ladan
>> >> >> >> To: John Baldwin
>> >> >> >> Cc: bug-followup@freebsd.org
>> >> >> >> Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
>> >> >> >> Date: Mon, 27 Jul 2009 15:51:15 +0200
>> >> >> >>
>> >> >> >>  2009/7/27 John Baldwin :
>> >> >> >>  > I would actually expect this to be the correct order for these
> two
>> >> >> > locks.
>> >> >> >>  Can
>> >> >> >>  > you capture the output of the 'debug.witness.fullgraph' sysctl
> to a
>> >> > file?
>> >> >> >>  >
>> >> >> >>  Yes, see attachment.  I'm still running the same 8.0-BETA2.
>> >> >> >
>> >> >> > Hmm, the attachment was eaten by a grue, can you post the file
>> > somewhere?
>> >> >> >
>> >> >> Yes, see ftp://rene-ladan.nl/pub/freebsd/kern_136945.txt
>> >> >
>> >> > Ok, it looks like it did encounter a UFS -> filedesc order at some
>> > point.  Can
>> >> > you patch sys/kern/subr_witness.c to add a section to the order_lists[]
>> > array
>> >> > after the 'ZFS locking list' and before the spin locks list that looks
>> > like
>> >> > this:
>> >> >
>> >> >         { "filedesc structure", &lock_class_sx },
>> >> >         { "ufs", &lock_class_lockmgr},
>> >> >         { NULL, NULL },
>> >> >
>> >> The LOR seems to be gone, previously it showed up only once right
>> >> after booting the system.
>> >>
>> >> But now a new LOR (according to the LOR page) seems pop up:
>> >> Trying to mount root from ufs:/dev/ad0s1a
>> >> lock order reversal:
>> >>  1st 0xffffff0002a4ad80 ufs (ufs)
> @ /usr/src/sys/ufs/ffs/ffs_vfsops.c:1465
>> >>  2nd 0xffffff0002b29a48 filedesc structure (filedesc structure) @
>> >> /usr/src/sys/kern/kern_descrip.c:2478
>> >> KDB: stack backtrace:
>> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>> >> _witness_debugger() at _witness_debugger+0x49
>> >> witness_checkorder() at witness_checkorder+0x7ea
>> >> _sx_xlock() at _sx_xlock+0x44
>> >> mountcheckdirs() at mountcheckdirs+0x80
>> >> vfs_donmount() at vfs_donmount+0xfbf
>> >> kernel_mount() at kernel_mount+0xa1
>> >> vfs_mountroot_try() at vfs_mountroot_try+0x177
>> >> vfs_mountroot() at vfs_mountroot+0x47d
>> >> start_init() at start_init+0x62
>> >> fork_exit() at fork_exit+0x12a
>> >> fork_trampoline() at fork_trampoline+0xe
>> >> --- trap 0, rip = 0, rsp = 0xffffff800001ad30, rbp = 0 ---
>> >>
>> >> The output of `df' and `mount' looks ok.
>> >
>> > Yes, this is the "real" LOR as "filedesc" -> "ufs" in the poll() case
> should
>> > be the normal order.  I believe this should fix it.  mountcheckdirs()
> doesn't
>> > need the vnodes locked, it just needs the caller to hold references on
> them
>> > so they aren't recycled:
>> >
>> > --- //depot/projects/smpng/sys/kern/vfs_mount.c#96
>> > +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
>> > @@ -1069,9 +1069,10 @@
>> >                 vfs_event_signal(NULL, VQ_MOUNT, 0);
>> >                 if (VFS_ROOT(mp, LK_EXCLUSIVE, &newdp))
>> >                         panic("mount: lost mount");
>> > +               VOP_UNLOCK(newdp, 0);
>> > +               VOP_UNLOCK(vp, 0);
>> >                 mountcheckdirs(vp, newdp);
>> > -               vput(newdp);
>> > -               VOP_UNLOCK(vp, 0);
>> > +               vrele(newdp);
>> >                 if ((mp->mnt_flag & MNT_RDONLY) == 0)
>> >                         error = vfs_allocate_syncvnode(mp);
>> >                 vfs_unbusy(mp);
>> >
>> The LOR is still present, but at a different place without the
>> mountcheckdirs() call (not on the LOR page either) :
>
> Ok, try this patch as well:
>
> --- //depot/projects/smpng/sys/kern/vfs_mount.c#97
> +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
> @@ -1481,6 +1481,8 @@
>         if (VFS_ROOT(TAILQ_FIRST(&mountlist), LK_EXCLUSIVE, &rootvnode))
>                 panic("Cannot find root vnode");
>
> +       VOP_UNLOCK(rootvnode, 0);
> +
>         p = curthread->td_proc;
>         FILEDESC_XLOCK(p->p_fd);
>
> @@ -1496,8 +1498,6 @@
>
>         FILEDESC_XUNLOCK(p->p_fd);
>
> -       VOP_UNLOCK(rootvnode, 0);
> -
>         EVENTHANDLER_INVOKE(mountroot);
>  }
>

Still no luck, I now get a LOR that is similar to LOR 281 right after booting:

lock order reversal:
 1st 0xffffff0002c2c7f8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2083
 2nd 0xffffff0002b2a248 filedesc structure (filedesc structure) @
/usr/src/sys/kern/vfs_syscalls.c:3776
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
_sx_slock() at _sx_slock+0x44
kern_mkdirat() at kern_mkdirat+0x201
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (136, FreeBSD ELF64, mkdir), rip = 0x800729dac, rsp =
0x7fffffffec88, rbp = 0x7fffffffef66 ---

René

From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 09:25:15 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EA74F106564A; Thu, 30 Jul 2009 09:25:15 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (skuns.zoral.com.ua [91.193.166.194]) by mx1.freebsd.org (Postfix) with ESMTP id 2B35A8FC0A; Thu, 30 Jul 2009 09:25:14 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id n6U9P74m098402 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 30 Jul 2009 12:25:07 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3) with ESMTP id n6U9P74S054032; Thu, 30 Jul 2009 12:25:07 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3/Submit) id n6U9P7TK054031; Thu, 30 Jul 2009 12:25:07 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 30 Jul 2009 12:25:07 +0300 From: Kostik Belousov To: Rene Ladan Message-ID: <20090730092507.GF1884@deviant.kiev.zoral.com.ua> References:
<200907271400.n6RE05Rv056472@freefall.freebsd.org> <200907290742.20838.jhb@freebsd.org> <200907291135.17569.jhb@freebsd.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="tRjCiSMHexiP9I5N" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: freebsd-fs@freebsd.org Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 09:25:16 -0000

--tRjCiSMHexiP9I5N
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Jul 30, 2009 at 11:05:32AM +0200, Rene Ladan wrote:
> 2009/7/29 John Baldwin :
> > On Wednesday 29 July 2009 11:20:21 am Rene Ladan wrote:
> >> 2009/7/29 John Baldwin :
> >> > On Wednesday 29 July 2009 5:52:24 am Rene Ladan wrote:
> >> >> 2009/7/28 John Baldwin :
> >> >> > On Tuesday 28 July 2009 10:03:40 am Rene Ladan wrote:
> >> >> >> 2009/7/28 John Baldwin :
> >> >> >> > On Monday 27 July 2009 10:00:05 am Rene Ladan wrote:
> >> >> >> >> The following reply was made to PR kern/136945; it has been noted
> > by
> >> >> > GNATS.
> >> >> >> >>
> >> >> >> >> From: Rene Ladan
> >> >> >> >> To: John Baldwin
> >> >> >> >> Cc: bug-followup@freebsd.org
> >> >> >> >> Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
> >> >> >> >> Date: Mon, 27 Jul 2009 15:51:15 +0200
> >> >> >> >>
> >> >> >> >>  2009/7/27 John Baldwin :
> >> >> >> >>  > I would actually expect this to be the correct order for these
> > two
> >> >> >> > locks.
> >> >> >> >>  Can
> >> >> >> >>  > you capture the output of the 'debug.witness.fullgraph' sysctl
> > to a
> >> >> > file?
> >> >> >> >>  >
> >> >> >> >>  Yes, see attachment.  I'm still running the same 8.0-BETA2.
> >> >> >> >
> >> >> >> > Hmm, the attachment was eaten by a grue, can you post the file
> >> > somewhere?
> >> >> >> >
> >> >> >> Yes, see ftp://rene-ladan.nl/pub/freebsd/kern_136945.txt
> >> >> >
> >> >> > Ok, it looks like it did encounter a UFS -> filedesc order at some
> >> > point.  Can
> >> >> > you patch sys/kern/subr_witness.c to add a section to the order_lists[]
> >> > array
> >> >> > after the 'ZFS locking list' and before the spin locks list that looks
> >> > like
> >> >> > this:
> >> >> >
> >> >> >         { "filedesc structure", &lock_class_sx },
> >> >> >         { "ufs", &lock_class_lockmgr},
> >> >> >         { NULL, NULL },
> >> >> >
> >> >> The LOR seems to be gone, previously it showed up only once right
> >> >> after booting the system.
> >> >>
> >> >> But now a new LOR (according to the LOR page) seems pop up:
> >> >> Trying to mount root from ufs:/dev/ad0s1a
> >> >> lock order reversal:
> >> >>  1st 0xffffff0002a4ad80 ufs (ufs)
> > @ /usr/src/sys/ufs/ffs/ffs_vfsops.c:1465
> >> >>  2nd 0xffffff0002b29a48 filedesc structure (filedesc structure) @
> >> >> /usr/src/sys/kern/kern_descrip.c:2478
> >> >> KDB: stack backtrace:
> >> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> >> >> _witness_debugger() at _witness_debugger+0x49
> >> >> witness_checkorder() at witness_checkorder+0x7ea
> >> >> _sx_xlock() at _sx_xlock+0x44
> >> >> mountcheckdirs() at mountcheckdirs+0x80
> >> >> vfs_donmount() at vfs_donmount+0xfbf
> >> >> kernel_mount() at kernel_mount+0xa1
> >> >> vfs_mountroot_try() at vfs_mountroot_try+0x177
> >> >> vfs_mountroot() at vfs_mountroot+0x47d
> >> >> start_init() at start_init+0x62
> >> >> fork_exit() at fork_exit+0x12a
> >> >> fork_trampoline() at fork_trampoline+0xe
> >> >> --- trap 0, rip = 0, rsp = 0xffffff800001ad30, rbp = 0 ---
> >> >>
> >> >> The output of `df' and `mount' looks ok.
> >> >
> >> > Yes, this is the "real" LOR as "filedesc" -> "ufs" in the poll() case
> > should
> >> > be the normal order.  I believe this should fix it.  mountcheckdirs()
> > doesn't
> >> > need the vnodes locked, it just needs the caller to hold references on
> > them
> >> > so they aren't recycled:
> >> >
> >> > --- //depot/projects/smpng/sys/kern/vfs_mount.c#96
> >> > +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
> >> > @@ -1069,9 +1069,10 @@
> >> >                 vfs_event_signal(NULL, VQ_MOUNT, 0);
> >> >                 if (VFS_ROOT(mp, LK_EXCLUSIVE, &newdp))
> >> >                         panic("mount: lost mount");
> >> > +               VOP_UNLOCK(newdp, 0);
> >> > +               VOP_UNLOCK(vp, 0);
> >> >                 mountcheckdirs(vp, newdp);
> >> > -               vput(newdp);
> >> > -               VOP_UNLOCK(vp, 0);
> >> > +               vrele(newdp);
> >> >                 if ((mp->mnt_flag & MNT_RDONLY) == 0)
> >> >                         error = vfs_allocate_syncvnode(mp);
> >> >                 vfs_unbusy(mp);
> >> >
> >> The LOR is still present, but at a different place without the
> >> mountcheckdirs() call (not on the LOR page either) :
> >
> > Ok, try this patch as well:
> >
> > --- //depot/projects/smpng/sys/kern/vfs_mount.c#97
> > +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
> > @@ -1481,6 +1481,8 @@
> >         if (VFS_ROOT(TAILQ_FIRST(&mountlist), LK_EXCLUSIVE, &rootvnode))
> >                 panic("Cannot find root vnode");
> >
> > +       VOP_UNLOCK(rootvnode, 0);
> > +
> >         p = curthread->td_proc;
> >         FILEDESC_XLOCK(p->p_fd);
> >
> > @@ -1496,8 +1498,6 @@
> >
> >         FILEDESC_XUNLOCK(p->p_fd);
> >
> > -       VOP_UNLOCK(rootvnode, 0);
> > -
> >         EVENTHANDLER_INVOKE(mountroot);
> >  }
> >
>
> Still no luck, I now get a LOR that is similar to LOR 281 right after booting:
>
> lock order reversal:
>  1st 0xffffff0002c2c7f8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2083
>  2nd 0xffffff0002b2a248 filedesc structure (filedesc structure) @
> /usr/src/sys/kern/vfs_syscalls.c:3776
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> _witness_debugger() at _witness_debugger+0x49
> witness_checkorder() at witness_checkorder+0x7ea
> _sx_slock() at _sx_slock+0x44
> kern_mkdirat() at kern_mkdirat+0x201
> syscall() at syscall+0x1af
> Xfast_syscall() at Xfast_syscall+0xe1
> --- syscall (136, FreeBSD ELF64, mkdir), rip = 0x800729dac, rsp =
> 0x7fffffffec88, rbp = 0x7fffffffef66 ---

Remove the FILEDESC_SLOCK()/FILEDESC_SUNLOCK() calls from kern_mkdirat().

--tRjCiSMHexiP9I5N
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (FreeBSD)

iEYEARECAAYFAkpxZvMACgkQC3+MBN1Mb4jYuwCfVFcNNfBmbW8V+8Xqb+Jfx6U4
/T4AnApBSyFpHdjgbkNzo+z8pSVmZi1p
=CQtO
-----END PGP SIGNATURE-----

--tRjCiSMHexiP9I5N--

From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 12:11:52 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2E230106564A; Thu, 30 Jul 2009 12:11:52 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 0054B8FC17; Thu, 30 Jul 2009 12:11:50 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id PAA27064; Thu, 30 Jul 2009 15:11:47 +0300 (EEST) (envelope-from avg@freebsd.org) Message-ID: <4A718E03.6030909@freebsd.org> Date: Thu, 30 Jul 2009 15:11:47 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090630) MIME-Version: 1.0 To: Thomas Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de>
<4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> In-Reply-To: <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 12:11:52 -0000 on 29/07/2009 21:04 Thomas Backman said the following: > Thanks for your work :) > However, bad news: it didn't help. It *might* have gotten us further, > though, because the DDB backtrace now looks like this: > > _sx_xlock_hard() > _sx_xlock() > zfs_znode_free() > zfs_freebsd_inactive() > VOP_INACTIVE_APV() > vinactive() > vput() > dounmount() > unmount() > syscall() > XFast_syscall() > Oh my bad. I missed the fact that recycle would do zfs_znode_free, so it seems like zfs_znode_free was called twice on the same znode. Could you please try replacing zfs_znode_free(zp); with vrecycle(vp, curthread); in the same block (instead of adding the latter before the former). Sorry, if this looks like shooting in the dark - because this is what it is. I am not familiar with the code and it's hard to follow all possibilities without good understanding. 
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 12:51:49 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7CFCF106564A; Thu, 30 Jul 2009 12:51:49 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id F3DD78FC1B; Thu, 30 Jul 2009 12:51:48 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:55335 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWV6Y-0002xN-5V; Thu, 30 Jul 2009 14:51:32 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id E98681734E3; Thu, 30 Jul 2009 14:51:30 +0200 (CEST) Message-Id: <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> From: Thomas Backman To: Andriy Gapon In-Reply-To: <4A718E03.6030909@freebsd.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Thu, 30 Jul 2009 14:51:28 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> 
<4A718E03.6030909@freebsd.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MWV6Y-0002xN-5V. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWV6Y-0002xN-5V b5e743d7e5f9cfd0cf05591b9c5c7e44 Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 12:51:49 -0000 On Jul 30, 2009, at 14:11, Andriy Gapon wrote: > on 29/07/2009 21:04 Thomas Backman said the following: >> Thanks for your work :) >> However, bad news: it didn't help. It *might* have gotten us further, >> though, because the DDB backtrace now looks like this: >> >> _sx_xlock_hard() >> _sx_xlock() >> zfs_znode_free() >> zfs_freebsd_inactive() >> VOP_INACTIVE_APV() >> vinactive() >> vput() >> dounmount() >> unmount() >> syscall() >> XFast_syscall() >> > > Oh my bad. I missed the fact that recycle would do zfs_znode_free, > so it seems > like zfs_znode_free was called twice on the same znode. > Could you please try replacing > zfs_znode_free(zp); > with > vrecycle(vp, curthread); > in the same block (instead of adding the latter before the former). > Sorry, if this looks like shooting in the dark - because this is > what it is. I am > not familiar with the code and it's hard to follow all possibilities > without good > understanding. New panic. :( Damnit! I think I'm using svn + http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch + your change, now... Unread portion of the kernel message buffer: GEOM_GATE: Device ggate1482 destroyed. 
panic: solaris assert: zp != ((void *)0), file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, line: 4359 cpuid = 0 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a panic() at panic+0x182 zfs_freebsd_reclaim() at zfs_freebsd_reclaim+0x244 VOP_RECLAIM_APV() at VOP_RECLAIM_APV+0x4a vgonel() at vgonel+0x12e vrecycle() at vrecycle+0x7d zfs_freebsd_inactive() at zfs_freebsd_inactive+0x1a VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0x4a vinactive() at vinactive+0x6a vput() at vput+0x1c6 dounmount() at dounmount+0x4af unmount() at unmount+0x3c8 syscall() at syscall+0x28f Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (22, FreeBSD ELF64, unmount), rip = 0x80104e9ec, rsp = 0x7fffffffaa98, rbp = 0x801223300 --- KDB: enter: panic 0xffffff00452971d8: tag zfs, type VDIR usecount 0, writecount 0, refcount 1 mountedhere 0 flags (VI_DOOMED|VI_DOINGINACT) lock type zfs: EXCL by thread 0xffffff0019ff6000 (pid 1425) panic: from debugger ... #11 0xffffffff8033a9cb in panic (fmt=Variable "fmt" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:558 #12 0xffffffff80b11124 in zfs_freebsd_reclaim () from /boot/kernel/zfs.ko #13 0xffffffff805c5c2a in VOP_RECLAIM_APV (vop=0x0, a=0xffffff803eaf8930) at vnode_if.c:1926 #14 0xffffffff803c839e in vgonel (vp=0xffffff00452971d8) at vnode_if.h:830 #15 0xffffffff803ca7ad in vrecycle (vp=0xffffff00452971d8, td=Variable "td" is not available. ) at /usr/src/sys/kern/vfs_subr.c:2504 #16 0xffffffff80b10aaa in zfs_freebsd_inactive () from /boot/kernel/zfs.ko #17 0xffffffff805c5b5a in VOP_INACTIVE_APV (vop=0xffffffff80b882a0, a=0xffffff803eaf89f0) at vnode_if.c:1863 #18 0xffffffff803c6aaa in vinactive (vp=0xffffff00452971d8, td=0xffffff0019ff6000) at vnode_if.h:807 #19 0xffffffff803cbf26 in vput (vp=0xffffff00452971d8) at /usr/src/sys/kern/vfs_subr.c:2257 #20 0xffffffff803c57ef in dounmount (mp=0xffffff0001d058d0, flags=0, td=Variable "td" is not available.
) at /usr/src/sys/kern/vfs_mount.c:1333 #21 0xffffffff803c5df8 in unmount (td=0xffffff0019ff6000, uap=0xffffff803eaf8bf0) at /usr/src/sys/kern/vfs_mount.c:1174 #22 0xffffffff805980bf in syscall (frame=0xffffff803eaf8c80) at /usr/src/sys/amd64/amd64/trap.c:984 #23 0xffffffff8057e2c1 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:373 #24 0x000000080104e9ec in ?? () FWIW: Line 4359 (panic line): zfs_freebsd_reclaim(ap) ... { vnode_t *vp = ap->a_vp; znode_t *zp = VTOZ(vp); ASSERT(ap != NULL); // added by me ASSERT(vp != NULL); // added by me >>> ASSERT(zp != NULL); // line 4359 --------------- zfs_inactive(vnode_t *vp, cred_t *cr, caller_context_t *ct) { znode_t *zp = VTOZ(vp); zfsvfs_t *zfsvfs = zp->z_zfsvfs; int error; rw_enter(&zfsvfs->z_teardown_inactive_lock, RW_READER); if (zp->z_dbuf == NULL) { /* * The fs has been unmounted, or we did a * suspend/resume and this file no longer exists. */ VI_LOCK(vp); vp->v_count = 0; /* count arrives as 1 */ vp->v_data = NULL; VI_UNLOCK(vp); rw_exit(&zfsvfs->z_teardown_inactive_lock); ZTOV(zp) = NULL; vrecycle(vp, curthread); // zfs_znode_free(zp); return; } Regards, Thomas PS. ... and thanks again for working to solve this. 
:) From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 12:55:44 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B725A10656C4; Thu, 30 Jul 2009 12:55:44 +0000 (UTC) (envelope-from mdounin@mdounin.ru) Received: from mdounin.cust.ramtel.ru (mdounin.cust.ramtel.ru [81.19.69.81]) by mx1.freebsd.org (Postfix) with ESMTP id 7006E8FC1D; Thu, 30 Jul 2009 12:55:44 +0000 (UTC) (envelope-from mdounin@mdounin.ru) Received: from mdounin.ru (mdounin.cust.ramtel.ru [81.19.69.81]) by mdounin.cust.ramtel.ru (Postfix) with ESMTP id F40FD1700F; Thu, 30 Jul 2009 16:55:42 +0400 (MSD) Date: Thu, 30 Jul 2009 16:55:42 +0400 From: Maxim Dounin To: Pawel Jakub Dawidek Message-ID: <20090730125542.GI8794@mdounin.ru> References: <20090730013857.GB8794@mdounin.ru> <20090730055001.GB2130@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090730055001.GB2130@garage.freebsd.pl> User-Agent: Mutt/1.5.19 (2009-01-05) Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: another zfs panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 12:55:45 -0000 Hello! On Thu, Jul 30, 2009 at 07:50:01AM +0200, Pawel Jakub Dawidek wrote: > On Thu, Jul 30, 2009 at 05:38:57AM +0400, Maxim Dounin wrote: > > Hello! > > > > Here is zfs panic I'm able to reproduce by running an scp from > > remote machine to zfs volume and 3 parallel untars of ports tree > > in cycle. Not sure that everything is required, but the above > > workload triggers panic in several hours. 
> > > > This is on fresh current with GENERIC kernel: > > > > panic: sx_xlock() of destroyed sx @ > > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:535 > > cpuid = 6 > > KDB: enter: panic > > [thread pid 36 tid 100071 ] > > Stopped at kdb_enter+0x3d: movq $0,0x68a040(%rip) > > db> bt > > Tracing pid 36 tid 100071 td 0xffffff00040f3720 > > kdb_enter() at kdb_enter+0x3d > > panic() at panic+0x17b > > _sx_xlock() at _sx_xlock+0xfc > > zfs_range_unlock() at zfs_range_unlock+0x38 > > zfs_get_data() at zfs_get_data+0xc1 > > zil_commit() at zil_commit+0x532 > > zfs_sync() at zfs_sync+0xa6 > > sync_fsync() at sync_fsync+0x13a > > sync_vnode() at sync_vnode+0x157 > > sched_sync() at sched_sync+0x1d1 > > fork_exit() at fork_exit+0x12a > > fork_trampoline() at fork_trampoline+0xe > > --- trap 0, rip = 0, rsp = 0xffffff80e7ee3d30, rbp = 0 --- > > > > Machine is otherwise idle. The only zfs-related tuning applied is > > compression=gzip-9. > > > > Please let me know if you want me to test some patches. > > The kernel syncer tries to sync vnode which has its znode already > destroyed. There is one place (that we know of) where vrecycle() is > missing. 
Could you try this patch: > > http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch Still here with patch applied: panic: sx_xlock() of destroyed sx @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:535 cpuid = 3 KDB: enter: panic [thread pid 37 tid 100072 ] Stopped at kdb_enter+0x3d: movq $0,0x68a040(%rip) db> bt Tracing pid 37 tid 100072 td 0xffffff00040f3390 kdb_enter() at kdb_enter+0x3d panic() at panic+0x17b _sx_xlock() at _sx_xlock+0xfc zfs_range_unlock() at zfs_range_unlock+0x38 zfs_get_data() at zfs_get_data+0xc1 zil_commit() at zil_commit+0x532 zfs_sync() at zfs_sync+0xa6 sync_fsync() at sync_fsync+0x13a sync_vnode() at sync_vnode+0x157 sched_sync() at sched_sync+0x1d1 fork_exit() at fork_exit+0x12a fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffff80e7ee8d30, rbp = 0 --- Maxim Dounin From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 12:55:50 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E6B9A10656C3; Thu, 30 Jul 2009 12:55:50 +0000 (UTC) (envelope-from r.c.ladan@gmail.com) Received: from ey-out-2122.google.com (ey-out-2122.google.com [74.125.78.26]) by mx1.freebsd.org (Postfix) with ESMTP id 112398FC14; Thu, 30 Jul 2009 12:55:49 +0000 (UTC) (envelope-from r.c.ladan@gmail.com) Received: by ey-out-2122.google.com with SMTP id 9so313249eyd.7 for ; Thu, 30 Jul 2009 05:55:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type; bh=Gta97IkswNt7ZazMPqUeoh5HID++vhxuYlPPMOepidM=; b=eu1EFJykTV8ZXBY6iIbcfHx8+yCVIVKelPBpBaq3slOfnhQ0bdsuPA9T5IubpBpU9h hKbGvN4fh1InAVq77GyFNfKerZo2nDmZ/4zoWnTCt5rYgjC7J7Y3ywqu4myOCZueJxEm jfhf/49BR5JZsqbhfbjOdYNLloj362qCoGaU4= DomainKey-Signature: a=rsa-sha1; c=nofws; 
d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; b=HV4I0RCzJbl/63sK11ArrFodkB2NlswZRrmCuE0urweSLuPqe1DnXakRNIkVra/rZQ m9LzC2TRRcqZoU0YPLbgiz8SgLdIqp8mZjgnrxsgGtZXB11vnB5/sQNfIzfepmr3WpMI r9FFfRBwVkRgdTP4WhkbHgD1Kj3fkrPoTW4rA= MIME-Version: 1.0 Sender: r.c.ladan@gmail.com Received: by 10.216.37.212 with SMTP id y62mr225469wea.5.1248958548825; Thu, 30 Jul 2009 05:55:48 -0700 (PDT) In-Reply-To: <20090730092507.GF1884@deviant.kiev.zoral.com.ua> References: <200907271400.n6RE05Rv056472@freefall.freebsd.org> <200907290742.20838.jhb@freebsd.org> <200907291135.17569.jhb@freebsd.org> <20090730092507.GF1884@deviant.kiev.zoral.com.ua> Date: Thu, 30 Jul 2009 14:55:48 +0200 X-Google-Sender-Auth: 2a5a2ac6ff0afa6e Message-ID: From: Rene Ladan To: Kostik Belousov Content-Type: multipart/mixed; boundary=0016367f9694a45687046febd226 Cc: freebsd-fs@freebsd.org Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 12:55:51 -0000 --0016367f9694a45687046febd226 Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: quoted-printable 2009/7/30 Kostik Belousov : > On Thu, Jul 30, 2009 at 11:05:32AM +0200, Rene Ladan wrote: >> 2009/7/29 John Baldwin : >> > On Wednesday 29 July 2009 11:20:21 am Rene Ladan wrote: >> >> 2009/7/29 John Baldwin : >> >> > On Wednesday 29 July 2009 5:52:24 am Rene Ladan wrote: >> >> >> 2009/7/28 John Baldwin : >> >> >> > On Tuesday 28 July 2009 10:03:40 am Rene Ladan wrote: >> >> >> >> 2009/7/28 John Baldwin : >> >> >> >> > On Monday 27 July 2009 10:00:05 am Rene Ladan wrote: >> >> >> >> >> The following reply was made to PR kern/136945; it has been = noted >> > by >> >> >> > GNATS. 
>> >> >> >> >> >> >> >> >> >> From: Rene Ladan >> >> >> >> >> To: John Baldwin >> >> >> >> >> Cc: bug-followup@freebsd.org >> >> >> >> >> Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) >> >> >> >> >> Date: Mon, 27 Jul 2009 15:51:15 +0200 >> >> >> >> >> >> >> >> >> >> 2009/7/27 John Baldwin : >> >> >> >> >> > I would actually expect this to be the correct order for these >> > two >> >> >> >> > locks. >> >> >> >> >> Can >> >> >> >> >> > you capture the output of the 'debug.witness.fullgraph' sysctl >> > to a >> >> >> > file? >> >> >> >> >> > >> >> >> >> >> Yes, see attachment. I'm still running the same 8.0-BETA2. >> >> >> >> > >> >> >> >> > Hmm, the attachment was eaten by a grue, can you post the file >> >> > somewhere? >> >> >> >> > >> >> >> >> Yes, see ftp://rene-ladan.nl/pub/freebsd/kern_136945.txt >> >> >> > >> >> >> > Ok, it looks like it did encounter a UFS -> filedesc order at some >> >> > point. Can >> >> >> > you patch sys/kern/subr_witness.c to add a section to the order_lists[] >> >> > array >> >> >> > after the 'ZFS locking list' and before the spin locks list that looks >> >> > like >> >> >> > this: >> >> >> > >> >> >> > { "filedesc structure", &lock_class_sx }, >> >> >> > { "ufs", &lock_class_lockmgr}, >> >> >> > { NULL, NULL }, >> >> >> > >> >> >> The LOR seems to be gone, previously it showed up only once right >> >> >> after booting the system.
>> >> >> >> >> >> But now a new LOR (according to the LOR page) seems pop up: >> >> >> Trying to mount root from ufs:/dev/ad0s1a >> >> >> lock order reversal: >> >> >> 1st 0xffffff0002a4ad80 ufs (ufs) >> > @ /usr/src/sys/ufs/ffs/ffs_vfsops.c:1465 >> >> >> 2nd 0xffffff0002b29a48 filedesc structure (filedesc structure) @ >> >> >> /usr/src/sys/kern/kern_descrip.c:2478 >> >> >> KDB: stack backtrace: >> >> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a >> >> >> _witness_debugger() at _witness_debugger+0x49 >> >> >> witness_checkorder() at witness_checkorder+0x7ea >> >> >> _sx_xlock() at _sx_xlock+0x44 >> >> >> mountcheckdirs() at mountcheckdirs+0x80 >> >> >> vfs_donmount() at vfs_donmount+0xfbf >> >> >> kernel_mount() at kernel_mount+0xa1 >> >> >> vfs_mountroot_try() at vfs_mountroot_try+0x177 >> >> >> vfs_mountroot() at vfs_mountroot+0x47d >> >> >> start_init() at start_init+0x62 >> >> >> fork_exit() at fork_exit+0x12a >> >> >> fork_trampoline() at fork_trampoline+0xe >> >> >> --- trap 0, rip = 0, rsp = 0xffffff800001ad30, rbp = 0 --- >> >> >> >> >> >> The output of `df' and `mount' looks ok. >> >> > >> >> > Yes, this is the "real" LOR as "filedesc" -> "ufs" in the poll() case >> > should >> >> > be the normal order. I believe this should fix it.
mountcheckdirs() >> > doesn't >> >> > need the vnodes locked, it just needs the caller to hold references on >> > them >> >> > so they aren't recycled: >> >> > >> >> > --- //depot/projects/smpng/sys/kern/vfs_mount.c#96 >> >> > +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c >> >> > @@ -1069,9 +1069,10 @@ >> >> > vfs_event_signal(NULL, VQ_MOUNT, 0); >> >> > if (VFS_ROOT(mp, LK_EXCLUSIVE, &newdp)) >> >> > panic("mount: lost mount"); >> >> > + VOP_UNLOCK(newdp, 0); >> >> > + VOP_UNLOCK(vp, 0); >> >> > mountcheckdirs(vp, newdp); >> >> > - vput(newdp); >> >> > - VOP_UNLOCK(vp, 0); >> >> > + vrele(newdp); >> >> > if ((mp->mnt_flag & MNT_RDONLY) == 0) >> >> > error = vfs_allocate_syncvnode(mp); >> >> > vfs_unbusy(mp); >> >> > >> >> The LOR is still present, but at a different place without the >> >> mountcheckdirs() call (not on the LOR page either) : >> > >> > Ok, try this patch as well: >> > >> > --- //depot/projects/smpng/sys/kern/vfs_mount.c#97 >> > +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c >> > @@ -1481,6 +1481,8 @@ >> > if (VFS_ROOT(TAILQ_FIRST(&mountlist), LK_EXCLUSIVE, &rootvnode)) >> > panic("Cannot find root vnode"); >> > >> > + VOP_UNLOCK(rootvnode, 0); >> > + >> > p = curthread->td_proc; >> > FILEDESC_XLOCK(p->p_fd); >> > >> > @@ -1496,8 +1498,6 @@ >> > >> > FILEDESC_XUNLOCK(p->p_fd); >> > >> > - VOP_UNLOCK(rootvnode, 0); >> > - >> > EVENTHANDLER_INVOKE(mountroot); >> > } >> > >> >> Still no luck, I now get a LOR
that is similar to LOR 281 right after bo= oting: >> >> lock order reversal: >> =9A1st 0xffffff0002c2c7f8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2083 >> =9A2nd 0xffffff0002b2a248 filedesc structure (filedesc structure) @ >> /usr/src/sys/kern/vfs_syscalls.c:3776 >> KDB: stack backtrace: >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a >> _witness_debugger() at _witness_debugger+0x49 >> witness_checkorder() at witness_checkorder+0x7ea >> _sx_slock() at _sx_slock+0x44 >> kern_mkdirat() at kern_mkdirat+0x201 >> syscall() at syscall+0x1af >> Xfast_syscall() at Xfast_syscall+0xe1 >> --- syscall (136, FreeBSD ELF64, mkdir), rip =3D 0x800729dac, rsp =3D >> 0x7fffffffec88, rbp =3D 0x7fffffffef66 --- > > Remove the FILEDESC_SLOCK()/FILEDESC_SUNLOCK() calls from kern_mkdirat(). > I removed the two lines at sys/kern/vfs_syscalls.c (3776 and 3778), but there still seem to be some LORs (attached). The two LORs about the reboot call are from before Kostiks pat= ch. --0016367f9694a45687046febd226 Content-Type: text/plain; charset=US-ASCII; name="dmesg-80b2-kib.txt" Content-Disposition: attachment; filename="dmesg-80b2-kib.txt" Content-Transfer-Encoding: base64 X-Attachment-Id: f_fxrho36l1 RnJlZUJTRCA4LjAtQkVUQTIgIzI6IFRodSBKdWwgMzAgMDk6eHg6eHggQ0VTVCAyMDA5DQoNCmxv Y2sgb3JkZXIgcmV2ZXJzYWw6ICgjMjc2KQ0KDQpsb2NrIG9yZGVyIHJldmVyc2FsOg0KIDFzdCAw eGZmZmZmZjAwMDJiNTAyNzAgdWZzICh1ZnMpIEAgL3Vzci9zcmMvc3lzL2tlcm4vdmZzX21vdW50 LmM6MTIwMA0KIDJuZCAweGZmZmZmZmZmODBiZGNjYTAgYWxscHJvYyAoYWxscHJvYykgQCAvdXNy L3NyYy9zeXMva2Vybi9rZXJuX2Rlc2NyaXAuYzoyNDczDQpLREI6IHN0YWNrIGJhY2t0cmFjZToN CmRiX3RyYWNlX3NlbGZfd3JhcHBlcigpIGF0IGRiX3RyYWNlX3NlbGZfd3JhcHBlcisweDJhDQpf d2l0bmVzc19kZWJ1Z2dlcigpIGF0IF93aXRuZXNzX2RlYnVnZ2VyKzB4NDkNCndpdG5lc3NfY2hl Y2tvcmRlcigpIGF0IHdpdG5lc3NfY2hlY2tvcmRlcisweDdlYQ0KX3N4X3Nsb2NrKCkgYXQgX3N4 X3Nsb2NrKzB4NDQNCm1vdW50Y2hlY2tkaXJzKCkgYXQgbW91bnRjaGVja2RpcnMrMHgzZg0KZG91 bm1vdW50KCkgYXQgZG91bm1vdW50KzB4NDc3DQp2ZnNfdW5tb3VudGFsbCgpIGF0IHZmc191bm1v 
dW50YWxsKzB4NTQNCmJvb3QoKSBhdCBib290KzB4ODE4DQpta2R1bXBoZWFkZXIoKSBhdCBta2R1 bXBoZWFkZXINCnN5c2NhbGwoKSBhdCBzeXNjYWxsKzB4MWFmDQpYZmFzdF9zeXNjYWxsKCkgYXQg WGZhc3Rfc3lzY2FsbCsweGUxDQotLS0gc3lzY2FsbCAoNTUsIEZyZWVCU0QgRUxGNjQsIHJlYm9v dCksIHJpcCA9IDB4ODAwNzhmODVjLCByc3AgPSAweDdmZmZmZmZmZWFiOCwgcmJwID0gMCAtLS0N Cg0KDQpGcmVlQlNEIDguMC1CRVRBMiAjMzogVGh1IEp1bCAzMCAxMzoyOTo0NiBDRVNUIDIwMDkN Cg0KbG9jayBvcmRlciByZXZlcnNhbDoNCiAxc3QgMHhmZmZmZmYwMDUxMGE1ZDgwIHVmcyAodWZz KSBAIC91c3Ivc3JjL3N5cy9rZXJuL2tlcm5fZXhlYy5jOjU3MA0KIDJuZCAweGZmZmZmZjAwMDJk ZmUyNDggZmlsZWRlc2Mgc3RydWN0dXJlIChmaWxlZGVzYyBzdHJ1Y3R1cmUpIEAgL3Vzci9zcmMv c3lzL2tlcm4va2Vybl9kZXNjcmlwLmM6MTg2NA0KS0RCOiBzdGFjayBiYWNrdHJhY2U6DQpkYl90 cmFjZV9zZWxmX3dyYXBwZXIoKSBhdCBkYl90cmFjZV9zZWxmX3dyYXBwZXIrMHgyYQ0KX3dpdG5l c3NfZGVidWdnZXIoKSBhdCBfd2l0bmVzc19kZWJ1Z2dlcisweDQ5DQp3aXRuZXNzX2NoZWNrb3Jk ZXIoKSBhdCB3aXRuZXNzX2NoZWNrb3JkZXIrMHg3ZWENCl9zeF94bG9jaygpIGF0IF9zeF94bG9j aysweDQ0DQpzZXR1Z2lkc2FmZXR5KCkgYXQgc2V0dWdpZHNhZmV0eSsweDQwDQprZXJuX2V4ZWN2 ZSgpIGF0IGtlcm5fZXhlY3ZlKzB4ZjIyDQpleGVjdmUoKSBhdCBleGVjdmUrMHgzOA0Kc3lzY2Fs bCgpIGF0IHN5c2NhbGwrMHgxYWYNClhmYXN0X3N5c2NhbGwoKSBhdCBYZmFzdF9zeXNjYWxsKzB4 ZTENCi0tLSBzeXNjYWxsICg1OSwgRnJlZUJTRCBFTEY2NCwgZXhlY3ZlKSwgcmlwID0gMHg4MDA3 YzNkMGMsIHJzcCA9IDB4N2ZmZmZmZmZlYzQ4LCByYnAgPSAweDdmZmZmZmZmZWQ1MCAtLS0NCg0K bG9jayBvcmRlciByZXZlcnNhbDogKGxpa2UgIzI2MSkNCiAxc3QgMHhmZmZmZmY4MDI5NTEyNDM4 IGJ1ZndhaXQgKGJ1ZndhaXQpIEAgL3Vzci9zcmMvc3lzL2tlcm4vdmZzX2Jpby5jOjI1NTgNCiAy bmQgMHhmZmZmZmYwMDAyYzQ0NDAwIGRpcmhhc2ggKGRpcmhhc2gpIEAgL3Vzci9zcmMvc3lzL3Vm cy91ZnMvdWZzX2Rpcmhhc2guYzoyODUNCktEQjogc3RhY2sgYmFja3RyYWNlOg0KZGJfdHJhY2Vf c2VsZl93cmFwcGVyKCkgYXQgZGJfdHJhY2Vfc2VsZl93cmFwcGVyKzB4MmENCl93aXRuZXNzX2Rl YnVnZ2VyKCkgYXQgX3dpdG5lc3NfZGVidWdnZXIrMHg0OQ0Kd2l0bmVzc19jaGVja29yZGVyKCkg YXQgd2l0bmVzc19jaGVja29yZGVyKzB4N2VhDQpfc3hfeGxvY2soKSBhdCBfc3hfeGxvY2srMHg0 NA0KdWZzZGlyaGFzaF9hY3F1aXJlKCkgYXQgdWZzZGlyaGFzaF9hY3F1aXJlKzB4MjkNCnVmc2Rp 
cmhhc2hfbW92ZSgpIGF0IHVmc2Rpcmhhc2hfbW92ZSsweDE5DQp1ZnNfZGlyZW50ZXIoKSBhdCB1 ZnNfZGlyZW50ZXIrMHg0YTkNCnVmc19tYWtlaW5vZGUoKSBhdCB1ZnNfbWFrZWlub2RlKzB4MmE3 DQpWT1BfQ1JFQVRFX0FQVigpIGF0IFZPUF9DUkVBVEVfQVBWKzB4OGQNCnZuX29wZW5fY3JlZCgp IGF0IHZuX29wZW5fY3JlZCsweDQwNg0Ka2Vybl9vcGVuYXQoKSBhdCBrZXJuX29wZW5hdCsweDE2 Mw0Kc3lzY2FsbCgpIGF0IHN5c2NhbGwrMHgxYWYNClhmYXN0X3N5c2NhbGwoKSBhdCBYZmFzdF9z eXNjYWxsKzB4ZTENCi0tLSBzeXNjYWxsICg1LCBGcmVlQlNEIEVMRjY0LCBvcGVuKSwgcmlwID0g MHg4MDA5ZGFkZWMsIHJzcCA9IDB4N2ZmZmZmZmZlNWM4LCByYnAgPSAweDFiNiAtLS0NCg== --0016367f9694a45687046febd226-- From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 13:14:17 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 092CB1065674; Thu, 30 Jul 2009 13:14:17 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id B7CA48FC25; Thu, 30 Jul 2009 13:14:15 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA28403; Thu, 30 Jul 2009 16:14:12 +0300 (EEST) (envelope-from avg@freebsd.org) Message-ID: <4A719CA4.4060400@freebsd.org> Date: Thu, 30 Jul 2009 16:14:12 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090724) MIME-Version: 1.0 To: Thomas Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> 
<6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> In-Reply-To: <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 13:14:17 -0000 Thomas, I wasn't clear - please make sure that you have original zfs_inactive (without the changes that Pawel proposed) with the only change zfs_znode_free -> vrecycle. I.e.: if (zp->z_dbuf == NULL) { /* * The fs has been unmounted, or we did a * suspend/resume and this file no longer exists. */ mutex_enter(&zp->z_lock); VI_LOCK(vp); vp->v_count = 0; /* count arrives as 1 */ mutex_exit(&zp->z_lock); rw_exit(&zfsvfs->z_teardown_inactive_lock); vrecycle(vp, curthread); return; } I believe that the latest panic is a direct result of ZTOV(zp) = NULL line introduced in zfs_vnops.c.2.patch. reclaim function should stay patched with Pawel's patch. 
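The reasoning above can be sketched as a small user-space C model. This is only an illustration, not kernel code: every toy_* name is hypothetical, frees are counted instead of performed so that a double free can be observed safely, and the assumption that vrecycle() always ends up in the reclaim step (which frees the znode) is a simplification of what this thread describes.

```c
#include <stddef.h>

/* Toy stand-ins for the kernel objects; none of this is real FreeBSD or ZFS API. */
struct toy_znode {
    int free_count;              /* how many times the znode was "freed" */
};

struct toy_vnode {
    struct toy_znode *v_data;    /* what VTOZ(vp) would hand back */
};

enum toy_result { TOY_OK, TOY_ASSERT_ZP_NULL, TOY_DOUBLE_FREE };

/* Models zfs_znode_free(): only counts, so a second call stays observable. */
static void toy_znode_free(struct toy_znode *zp)
{
    zp->free_count++;
}

/* Models the reclaim step: "ASSERT(zp != NULL)", then free the znode. */
static enum toy_result toy_reclaim(struct toy_vnode *vp)
{
    struct toy_znode *zp = vp->v_data;

    if (zp == NULL)
        return TOY_ASSERT_ZP_NULL;
    toy_znode_free(zp);
    vp->v_data = NULL;
    return TOY_OK;
}

/* Models vrecycle(): in this sketch it always reaches reclaim. */
static enum toy_result toy_vrecycle(struct toy_vnode *vp)
{
    return toy_reclaim(vp);
}

/* Variant A: vrecycle() added in front of the explicit free. */
static enum toy_result toy_inactive_both(struct toy_vnode *vp)
{
    struct toy_znode *zp = vp->v_data;
    enum toy_result r = toy_vrecycle(vp);

    if (r != TOY_OK)
        return r;
    toy_znode_free(zp);          /* second free of the same znode */
    return zp->free_count > 1 ? TOY_DOUBLE_FREE : TOY_OK;
}

/* Variant B: the vnode/znode link is cleared before vrecycle(). */
static enum toy_result toy_inactive_cleared(struct toy_vnode *vp)
{
    vp->v_data = NULL;
    return toy_vrecycle(vp);     /* reclaim now sees zp == NULL */
}

/* Variant C: the suggestion above, vrecycle() only. */
static enum toy_result toy_inactive_recycle_only(struct toy_vnode *vp)
{
    return toy_vrecycle(vp);
}
```

In this model, variant A corresponds to the first attempt (znode freed twice), variant B to the "zp != NULL" assertion in the reclaim path, and variant C frees the znode exactly once. Whether the real code paths behave the same way still has to be confirmed by testing.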
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 13:21:27 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9B5331065679; Thu, 30 Jul 2009 13:21:27 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (skuns.zoral.com.ua [91.193.166.194]) by mx1.freebsd.org (Postfix) with ESMTP id E677F8FC1E; Thu, 30 Jul 2009 13:21:26 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id n6UDLMnG014222 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 30 Jul 2009 16:21:22 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3) with ESMTP id n6UDLLS7015018; Thu, 30 Jul 2009 16:21:21 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3/Submit) id n6UDLLY7015017; Thu, 30 Jul 2009 16:21:21 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 30 Jul 2009 16:21:21 +0300 From: Kostik Belousov To: Rene Ladan Message-ID: <20090730132121.GH1884@deviant.kiev.zoral.com.ua> References: <200907271400.n6RE05Rv056472@freefall.freebsd.org> <200907290742.20838.jhb@freebsd.org> <200907291135.17569.jhb@freebsd.org> <20090730092507.GF1884@deviant.kiev.zoral.com.ua> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="9A1A73/U17WN0PFw" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=5.0 
tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: freebsd-fs@freebsd.org Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 13:21:28 -0000 --9A1A73/U17WN0PFw Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Jul 30, 2009 at 02:55:48PM +0200, Rene Ladan wrote: > > FreeBSD 8.0-BETA2 #3: Thu Jul 30 13:29:46 CEST 2009 > > lock order reversal: > 1st 0xffffff00510a5d80 ufs (ufs) @ /usr/src/sys/kern/kern_exec.c:570 > 2nd 0xffffff0002dfe248 filedesc structure (filedesc structure) @ /usr/src/sys/kern/kern_descrip.c:1864 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a > _witness_debugger() at _witness_debugger+0x49 > witness_checkorder() at witness_checkorder+0x7ea > _sx_xlock() at _sx_xlock+0x44 > setugidsafety() at setugidsafety+0x40 > kern_execve() at kern_execve+0xf22 > execve() at execve+0x38 > syscall() at syscall+0x1af > Xfast_syscall() at Xfast_syscall+0xe1 > --- syscall (59, FreeBSD ELF64, execve), rip = 0x8007c3d0c, rsp = 0x7fffffffec48, rbp = 0x7fffffffed50 --- For this one, please swap the order of lines 676 and 677 in sys/kern/kern_exec.c, that is, make it VOP_UNLOCK(imgp->vp, 0); setugidsafety(td); instead of setugidsafety(td); VOP_UNLOCK(imgp->vp, 0); --9A1A73/U17WN0PFw Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (FreeBSD) iEUEARECAAYFAkpxnlEACgkQC3+MBN1Mb4iQdACfcPg7eF9peTrsew6tkY65XfFY cOYAl2sGeCOh/N1xLW+hsf1oEXkv454= =PmIC -----END PGP SIGNATURE----- --9A1A73/U17WN0PFw-- From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 13:21:51 2009 Return-Path:
Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C3C79106566B for ; Thu, 30 Jul 2009 13:21:51 +0000 (UTC) (envelope-from vinnix.bsd@gmail.com) Received: from mail-ew0-f206.google.com (mail-ew0-f206.google.com [209.85.219.206]) by mx1.freebsd.org (Postfix) with ESMTP id 1B4288FC24 for ; Thu, 30 Jul 2009 13:21:50 +0000 (UTC) (envelope-from vinnix.bsd@gmail.com) Received: by ewy2 with SMTP id 2so699067ewy.43 for ; Thu, 30 Jul 2009 06:21:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type:content-transfer-encoding; bh=0WJuobipT0AxAwxm1BBR37/FwDeHZLpAn/Af/D2KZzA=; b=rWVLRtsQHu6B/kdP6eNJyhjuOqXrB7/n7qgOfJyq3nucE/M2xK9RVKDi2ENFoSqTOr 2bTsxsCLMUqze4TXlN+j4ICmIkvCfyUNcDLI5qc9t/yxd5rvL3el+3u3dJla9liuZaJM puSOZ/I01O0whl+4t7zaz3iFA3ZXiz0jUt0iw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type :content-transfer-encoding; b=MOVRZ/jF5sgnX28bTGLs1NZapUnV55irMINQQCElTwomvyy01ocPPEPsDJXbHJR5X8 tpr7wJgG0LDrjH+ARRCUk86MiG1y6JZl1Ay3ft8+ymY2DCV2f8ny9g3uSKzldmhA6pc7 8fSbzacmNPbE0/PJp5a2RmnCPHjyuQdRMjUqg= MIME-Version: 1.0 Received: by 10.210.16.17 with SMTP id 17mr1629452ebp.53.1248958446867; Thu, 30 Jul 2009 05:54:06 -0700 (PDT) Date: Thu, 30 Jul 2009 09:54:06 -0300 Message-ID: <1e31c7980907300554k46ab5a70mf0ed36f6ab7f2cd9@mail.gmail.com> From: Vinicius Abrahao To: freebsd-fs@freebsd.org, freebsd-current Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Subject: Can'not mount "sysid 15 (0x0f),(Extended DOS (LBA))" at da0s2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 13:21:52 -0000 Hello 
friends, I created a new "FAT32" partition with Acronis Disk Director[1] in the same place where I previously had a UFS partition. Now when I try to mount this new partition I get this error: # mount /dev/da0s2 /mnt/usb1 mount: /dev/da0s2 : Invalid argument # mount -t msdosfs /dev/da0s2 /mnt/usb2 mount_msdosfs: /dev/da0s2: Invalid argument I suspect the "strange thing" is the sysid 15 for this partition, not the sysid 12 found on another USB disk that I have here. Could you help me with this problem? Thanks so much, Vinnix [1]: http://www.acronis.com/homecomputing/products/diskdirector/ [2]: # fdisk da0 ******* Working on device /dev/da0 ******* parameters extracted from in-core disklabel are: cylinders=9729 heads=255 sectors/track=63 (16065 blks/cyl) Figures below won't work with BIOS for partitions not in cyl 1 parameters to be used for BIOS calculations are: cylinders=9729 heads=255 sectors/track=63 (16065 blks/cyl) Media sector size is 512 Warning: BIOS sector numbering starts with sector 1 Information from DOS bootblock is: The data for partition 1 is: sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD) start 63, size 41929587 (20473 Meg), flag 80 (active) beg: cyl 0/ head 1/ sector 1; end: cyl 1023/ head 254/ sector 63 The data for partition 2 is: sysid 15 (0x0f),(Extended DOS (LBA)) start 41929650, size 114366735 (55843 Meg), flag 0 beg: cyl 1023/ head 0/ sector 1; end: cyl 1023/ head 254/ sector 63 The data for partition 3 is: The data for partition 4 is: [3]: # file -s /dev/da* /dev/da0: x86 boot sector; partition 1: ID=0xa5, active, starthead 1, startsector 63, 41929587 sectors; partition 2: ID=0xf, starthead 0, startsector 41929650, 114366735 sectors, code offset 0x31 /dev/da0s1: Unix Fast File system [v2] (little-endian) last mounted on /u02, last written at Wed Jul 1 19:12:35 2009, clean flag 1, readonly flag 0, number of blocks 10482396, number of data blocks 10150835, number of cylinder groups 112, block size 16384, fragment size 2048, average file size
16384, average number of files in dir 64, pending blocks to free 0, pending inodes to free 0, system-wide uuid 0, minimum percentage of free blocks 8, TIME optimization /dev/da0s1a: Unix Fast File system [v2] (little-endian) last mounted on /, last written at Tue Jul 28 13:06:51 2009, clean flag 1, readonly flag 0, number of blocks 10482392, number of data blocks 10150831, number of cylinder groups 112, block size 16384, fragment size 2048, average file size 16384, average number of files in dir 64, pending blocks to free 0, pending inodes to free 0, system-wide uuid 0, minimum percentage of free blocks 8, TIME optimization /dev/da0s2: x86 boot sector; partition 1: ID=0xb, starthead 1, startsector 63, 114366672 sectors, extended partition table (last)\011, code offset 0x0, BSD disklabel [4]: # uname -a FreeBSD vinnix.corp.triarius.com.br 8.0-BETA2 FreeBSD 8.0-BETA2 #9: Tue Jul 21 21:28:19 BRT 2009 root@vinnix.corp.triarius.com.br:/usr/obj/usr/src/sys/VINNIX amd64 From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 13:25:21 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 77F2D1065672 for ; Thu, 30 Jul 2009 13:25:21 +0000 (UTC) (envelope-from spambox@haruhiism.net) Received: from fujibayashi.jp (karas.fujibayashi.jp [77.221.159.4]) by mx1.freebsd.org (Postfix) with ESMTP id 305DC8FC17 for ; Thu, 30 Jul 2009 13:25:21 +0000 (UTC) (envelope-from spambox@haruhiism.net) Received: from [192.168.0.2] (ppp91-122-47-189.pppoe.avangarddsl.ru [91.122.47.189]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by fujibayashi.jp (Postfix) with ESMTPSA id C471978F97; Thu, 30 Jul 2009 17:08:00 +0400 (MSD) Message-ID: <4A719B34.3000203@haruhiism.net> Date: Thu, 30 Jul 2009 17:08:04 +0400 From: Kamigishi Rei User-Agent: Thunderbird 2.0.0.22 (Windows/20090605) MIME-Version: 1.0 To: Vinicius Abrahao References: 
<1e31c7980907300554k46ab5a70mf0ed36f6ab7f2cd9@mail.gmail.com>
In-Reply-To: <1e31c7980907300554k46ab5a70mf0ed36f6ab7f2cd9@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, freebsd-current
Subject: Re: Can'not mount "sysid 15 (0x0f),(Extended DOS (LBA))" at da0s2
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 30 Jul 2009 13:25:21 -0000

Vinicius Abrahao wrote:
> I created a new "FAT32" partition with Acronis Disk Director[1] in the
> same place where I previously had a UFS partition.
> Now when I try to mount this new partition I get this error:
>
> # mount /dev/da0s2 /mnt/usb1
> mount: /dev/da0s2 : Invalid argument
>
> # mount -t msdosfs /dev/da0s2 /mnt/usb2
> mount_msdosfs: /dev/da0s2: Invalid argument
>
> The data for partition 2 is:
> sysid 15 (0x0f),(Extended DOS (LBA))

da0s2 is an Extended DOS partition. It holds no FAT file system of its own;
it is only a container that points to the table of logical drives inside it.
Try mounting da0s5, for example, or check which da0s* devices are available.
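
To make the sysid point concrete, here is a minimal sketch in Python of how
one might classify the type bytes that fdisk prints. It is only an
illustration: the function name describe_sysid and the small table of codes
are assumptions made for this example, not part of fdisk or any FreeBSD
tool, and the table covers just the values that appear in this thread plus
a few close relatives.

```python
# Toy classifier for MBR partition type bytes ("sysid" in fdisk output).
# The sets below are a small, hand-picked subset of well-known type codes;
# describe_sysid is a hypothetical helper, not a FreeBSD API.

EXTENDED_TYPES = {0x05, 0x0F, 0x85}   # containers that hold logical drives
FAT32_TYPES = {0x0B, 0x0C}            # FAT32 (CHS) and FAT32 (LBA)
FREEBSD_TYPE = 0xA5                   # FreeBSD/NetBSD/386BSD slice


def describe_sysid(sysid: int) -> str:
    """Return a rough hint about what an MBR type byte means for mounting."""
    if sysid in EXTENDED_TYPES:
        return "extended container: mount a logical drive inside it instead"
    if sysid in FAT32_TYPES:
        return "FAT32: try mount -t msdosfs on this slice"
    if sysid == FREEBSD_TYPE:
        return "FreeBSD slice"
    return "unknown to this sketch"


# The three values mentioned in the thread: da0s1, da0s2, and the other disk.
for sysid in (165, 15, 12):
    print(f"sysid {sysid} (0x{sysid:02x}): {describe_sysid(sysid)}")
```

With the values from the fdisk output above, sysid 15 falls in the
"extended container" group, which matches the advice to look at da0s5
rather than da0s2.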
-- Kamigishi Rei KREI-RIPE From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 13:32:05 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 11C2D1065673; Thu, 30 Jul 2009 13:32:05 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 880258FC16; Thu, 30 Jul 2009 13:32:04 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:55670 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWVjj-0007SM-4P; Thu, 30 Jul 2009 15:32:01 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 50FB6173529; Thu, 30 Jul 2009 15:32:00 +0200 (CEST) Message-Id: <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> From: Thomas Backman To: Andriy Gapon In-Reply-To: <4A719CA4.4060400@freebsd.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Thu, 30 Jul 2009 15:31:58 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> 
<4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org>
X-Mailer: Apple Mail (2.935.3)
X-Originating-IP: 83.253.252.234
X-Scan-Result: No virus found in message 1MWVjj-0007SM-4P.
X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWVjj-0007SM-4P a695d8c928608426409cbd7444cdc9d4
Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek
Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 30 Jul 2009 13:32:05 -0000

On Jul 30, 2009, at 15:14, Andriy Gapon wrote:
> Thomas,
>
> I wasn't clear - please make sure that you have original zfs_inactive
> (without the changes that Pawel proposed) with the only change
> zfs_znode_free -> vrecycle.
> I.e.:
>     if (zp->z_dbuf == NULL) {
>         /*
>          * The fs has been unmounted, or we did a
>          * suspend/resume and this file no longer exists.
>          */
>         mutex_enter(&zp->z_lock);
>         VI_LOCK(vp);
>         vp->v_count = 0; /* count arrives as 1 */
>         mutex_exit(&zp->z_lock);
>         rw_exit(&zfsvfs->z_teardown_inactive_lock);
>         vrecycle(vp, curthread);
>         return;
>     }
>
> I believe that the latest panic is a direct result of ZTOV(zp) = NULL line
> introduced in zfs_vnops.c.2.patch.
>
> reclaim function should stay patched with Pawel's patch.

Hey, it works!!! :)
For the first time ever, my now mislabeled "clone_crash.sh" doesn't
panic! A quick test of my ordinary, actually-used backup script also
worked fine!

For the record, here's the diff I got:
http://exscape.org/temp/zfs_vnops.c.patch

Thanks a lot! Hope to see this tested further (I'll do some more testing
for sure) so that we can consider it a stable change.
Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 13:35:17 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6188D106566B; Thu, 30 Jul 2009 13:35:17 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 2DECB8FC13; Thu, 30 Jul 2009 13:35:15 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA28923; Thu, 30 Jul 2009 16:35:13 +0300 (EEST) (envelope-from avg@freebsd.org) Message-ID: <4A71A191.1080307@freebsd.org> Date: Thu, 30 Jul 2009 16:35:13 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090724) MIME-Version: 1.0 To: Thomas Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> In-Reply-To: <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal 
trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 13:35:17 -0000 on 30/07/2009 16:31 Thomas Backman said the following: > Hey, it works!!! :) > For the first time ever, my now mislabeled "clone_crash.sh" doesn't > panic! A quick test of my ordinary, actually-used backup script also > worked fine! > > For the record, here's the diff I got: > http://exscape.org/temp/zfs_vnops.c.patch > > Thanks a lot! Hope to see this tested further (I'll do some more testing > for sure) so that we can consider it a stable change. Very good! Thank you for all the testing and debugging feedback! And for your patience and persistence too :-) -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 13:52:11 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E071C1065674; Thu, 30 Jul 2009 13:52:11 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 0ACF58FC0A; Thu, 30 Jul 2009 13:52:11 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 78E2946B32; Thu, 30 Jul 2009 09:52:10 -0400 (EDT) Received: from jhbbsd.hudson-trading.com (unknown [209.249.190.8]) by bigwig.baldwin.cx (Postfix) with ESMTPA id 877F48A0A4; Thu, 30 Jul 2009 09:52:09 -0400 (EDT) From: John Baldwin To: Kostik Belousov Date: Thu, 30 Jul 2009 08:45:27 -0400 User-Agent: KMail/1.9.7 References: <200907271400.n6RE05Rv056472@freefall.freebsd.org> <20090730092507.GF1884@deviant.kiev.zoral.com.ua> In-Reply-To: <20090730092507.GF1884@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: 
text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
Message-Id: <200907300845.27663.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Thu, 30 Jul 2009 09:52:09 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.5 required=4.2 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: freebsd-fs@freebsd.org, Rene Ladan
Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 30 Jul 2009 13:52:12 -0000

On Thursday 30 July 2009 5:25:07 am Kostik Belousov wrote:
> On Thu, Jul 30, 2009 at 11:05:32AM +0200, Rene Ladan wrote:
> > 2009/7/29 John Baldwin :
> > > On Wednesday 29 July 2009 11:20:21 am Rene Ladan wrote:
> > >> 2009/7/29 John Baldwin :
> > >> > On Wednesday 29 July 2009 5:52:24 am Rene Ladan wrote:
> > >> >> 2009/7/28 John Baldwin :
> > >> >> > On Tuesday 28 July 2009 10:03:40 am Rene Ladan wrote:
> > >> >> >> 2009/7/28 John Baldwin :
> > >> >> >> > On Monday 27 July 2009 10:00:05 am Rene Ladan wrote:
> > >> >> >> >> The following reply was made to PR kern/136945; it has been
> > >> >> >> >> noted by GNATS.
> > >> >> >> >>
> > >> >> >> >> From: Rene Ladan
> > >> >> >> >> To: John Baldwin
> > >> >> >> >> Cc: bug-followup@freebsd.org
> > >> >> >> >> Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
> > >> >> >> >> Date: Mon, 27 Jul 2009 15:51:15 +0200
> > >> >> >> >>
> > >> >> >> >>  2009/7/27 John Baldwin :
> > >> >> >> >>  > I would actually expect this to be the correct order for
> > >> >> >> >>  > these two locks.  Can
> > >> >> >> >>  > you capture the output of the 'debug.witness.fullgraph'
> > >> >> >> >>  > sysctl to a file?
> > >> >> >> >>  >
> > >> >> >> >>  Yes, see attachment.  I'm still running the same 8.0-BETA2.
> > >> >> >> >
> > >> >> >> > Hmm, the attachment was eaten by a grue, can you post the file
> > >> >> >> > somewhere?
> > >> >> >> >
> > >> >> >> Yes, see ftp://rene-ladan.nl/pub/freebsd/kern_136945.txt
> > >> >> >
> > >> >> > Ok, it looks like it did encounter a UFS -> filedesc order at some
> > >> >> > point.  Can you patch sys/kern/subr_witness.c to add a section to
> > >> >> > the order_lists[] array after the 'ZFS locking list' and before the
> > >> >> > spin locks list that looks like this:
> > >> >> >
> > >> >> >         { "filedesc structure", &lock_class_sx },
> > >> >> >         { "ufs", &lock_class_lockmgr},
> > >> >> >         { NULL, NULL },
> > >> >> >
> > >> >> The LOR seems to be gone, previously it showed up only once right
> > >> >> after booting the system.
> > >> >>
> > >> >> But now a new LOR (according to the LOR page) seems to pop up:
> > >> >> Trying to mount root from ufs:/dev/ad0s1a
> > >> >> lock order reversal:
> > >> >>  1st 0xffffff0002a4ad80 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_vfsops.c:1465
> > >> >>  2nd 0xffffff0002b29a48 filedesc structure (filedesc structure) @
> > >> >> /usr/src/sys/kern/kern_descrip.c:2478
> > >> >> KDB: stack backtrace:
> > >> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> > >> >> _witness_debugger() at _witness_debugger+0x49
> > >> >> witness_checkorder() at witness_checkorder+0x7ea
> > >> >> _sx_xlock() at _sx_xlock+0x44
> > >> >> mountcheckdirs() at mountcheckdirs+0x80
> > >> >> vfs_donmount() at vfs_donmount+0xfbf
> > >> >> kernel_mount() at kernel_mount+0xa1
> > >> >> vfs_mountroot_try() at vfs_mountroot_try+0x177
> > >> >> vfs_mountroot() at vfs_mountroot+0x47d
> > >> >> start_init() at start_init+0x62
> > >> >> fork_exit() at fork_exit+0x12a
> > >> >> fork_trampoline() at fork_trampoline+0xe
> > >> >> --- trap 0, rip = 0, rsp = 0xffffff800001ad30, rbp = 0 ---
> > >> >>
> > >> >> The output of `df' and `mount' looks ok.
> > >> >
> > >> > Yes, this is the "real" LOR as "filedesc" -> "ufs" in the poll() case
> > >> > should be the normal order.  I believe this should fix it.
> > >> > mountcheckdirs() doesn't need the vnodes locked, it just needs the
> > >> > caller to hold references on them so they aren't recycled:
> > >> >
> > >> > --- //depot/projects/smpng/sys/kern/vfs_mount.c#96
> > >> > +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
> > >> > @@ -1069,9 +1069,10 @@
> > >> >                 vfs_event_signal(NULL, VQ_MOUNT, 0);
> > >> >                 if (VFS_ROOT(mp, LK_EXCLUSIVE, &newdp))
> > >> >                         panic("mount: lost mount");
> > >> > +               VOP_UNLOCK(newdp, 0);
> > >> > +               VOP_UNLOCK(vp, 0);
> > >> >                 mountcheckdirs(vp, newdp);
> > >> > -               vput(newdp);
> > >> > -               VOP_UNLOCK(vp, 0);
> > >> > +               vrele(newdp);
> > >> >                 if ((mp->mnt_flag & MNT_RDONLY) == 0)
> > >> >                         error = vfs_allocate_syncvnode(mp);
> > >> >                 vfs_unbusy(mp);
> > >> >
> > >> The LOR is still present, but at a different place without the
> > >> mountcheckdirs() call (not on the LOR page either) :
> > >
> > > Ok, try this patch as well:
> > >
> > > --- //depot/projects/smpng/sys/kern/vfs_mount.c#97
> > > +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
> > > @@ -1481,6 +1481,8 @@
> > >         if (VFS_ROOT(TAILQ_FIRST(&mountlist), LK_EXCLUSIVE, &rootvnode))
> > >                 panic("Cannot find root vnode");
> > >
> > > +       VOP_UNLOCK(rootvnode, 0);
> > > +
> > >         p = curthread->td_proc;
> > >         FILEDESC_XLOCK(p->p_fd);
> > >
> > > @@ -1496,8 +1498,6 @@
> > >
> > >         FILEDESC_XUNLOCK(p->p_fd);
> > >
> > > -       VOP_UNLOCK(rootvnode, 0);
> > > -
> > >         EVENTHANDLER_INVOKE(mountroot);
> > >  }
> > >
> >
> > Still no luck, I now get a LOR that is similar to LOR 281 right after
> > booting:
> >
> > lock order reversal:
> > 1st 0xffffff0002c2c7f8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2083
> > 2nd 0xffffff0002b2a248 filedesc structure (filedesc structure) @
> > /usr/src/sys/kern/vfs_syscalls.c:3776
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> > _witness_debugger() at _witness_debugger+0x49
> > witness_checkorder() at witness_checkorder+0x7ea
> > _sx_slock() at _sx_slock+0x44
> > kern_mkdirat() at kern_mkdirat+0x201
> > syscall() at syscall+0x1af
> > Xfast_syscall() at Xfast_syscall+0xe1
> > --- syscall (136, FreeBSD ELF64, mkdir), rip = 0x800729dac, rsp =
> > 0x7fffffffec88, rbp = 0x7fffffffef66 ---
>
> Remove the FILEDESC_SLOCK()/FILEDESC_SUNLOCK() calls from kern_mkdirat().

Several other system calls have the same LOR and need the same fix.  I've
consolidated all the fixes so far into
http://www.FreeBSD.org/~jhb/patches/vnode_filedesc.patch

-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 13:52:12 2009
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0B900106567A; Thu, 30 Jul 2009 13:52:12 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id BFE988FC15; Thu, 30 Jul 2009 13:52:11 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 741F546B37; Thu, 30 Jul 2009 09:52:11 -0400 (EDT)
Received: from jhbbsd.hudson-trading.com (unknown [209.249.190.8]) by bigwig.baldwin.cx (Postfix) with ESMTPA id 7AB9D8A0A5; Thu, 30 Jul 2009 09:52:10 -0400 (EDT)
From: John Baldwin
To: Rene Ladan
Date: Thu, 30 Jul 2009 09:51:43 -0400
User-Agent: KMail/1.9.7
References: <200907271400.n6RE05Rv056472@freefall.freebsd.org> <20090730092507.GF1884@deviant.kiev.zoral.com.ua>
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
Message-Id: <200907300951.44364.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Thu, 30 Jul 2009 09:52:10 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.5 required=4.2 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: freebsd-fs@freebsd.org
Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 30 Jul 2009 13:52:12 -0000

On Thursday 30 July 2009 8:55:48 am Rene Ladan wrote:
> 2009/7/30 Kostik Belousov :
> > On Thu, Jul 30, 2009 at 11:05:32AM +0200, Rene Ladan wrote:
> >> 2009/7/29 John Baldwin :
> >> > On Wednesday 29 July 2009 11:20:21 am Rene Ladan wrote:
> >> >> 2009/7/29 John Baldwin :
> >> >> > On Wednesday 29 July 2009 5:52:24 am Rene Ladan wrote:
> >> >> >> 2009/7/28 John Baldwin :
> >> >> >> > On Tuesday 28 July 2009 10:03:40 am Rene Ladan wrote:
> >> >> >> >> 2009/7/28 John Baldwin :
> >> >> >> >> > On Monday 27 July 2009 10:00:05 am Rene Ladan wrote:
> >> >> >> >> >> The following reply was made to PR kern/136945; it has been
> >> >> >> >> >> noted by GNATS.
> >> >> >> >> >>
> >> >> >> >> >> From: Rene Ladan
> >> >> >> >> >> To: John Baldwin
> >> >> >> >> >> Cc: bug-followup@freebsd.org
> >> >> >> >> >> Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
> >> >> >> >> >> Date: Mon, 27 Jul 2009 15:51:15 +0200
> >> >> >> >> >>
> >> >> >> >> >>  2009/7/27 John Baldwin :
> >> >> >> >> >>  > I would actually expect this to be the correct order for
> >> >> >> >> >>  > these two locks.  Can
> >> >> >> >> >>  > you capture the output of the 'debug.witness.fullgraph'
> >> >> >> >> >>  > sysctl to a file?
> >> >> >> >> >>  >
> >> >> >> >> >>  Yes, see attachment.  I'm still running the same 8.0-BETA2.
> >> >> >> >> >
> >> >> >> >> > Hmm, the attachment was eaten by a grue, can you post the file
> >> >> >> >> > somewhere?
> >> >> >> >> >
> >> >> >> >> Yes, see ftp://rene-ladan.nl/pub/freebsd/kern_136945.txt
> >> >> >> >
> >> >> >> > Ok, it looks like it did encounter a UFS -> filedesc order at some
> >> >> >> > point.  Can you patch sys/kern/subr_witness.c to add a section to
> >> >> >> > the order_lists[] array after the 'ZFS locking list' and before the
> >> >> >> > spin locks list that looks like this:
> >> >> >> >
> >> >> >> >         { "filedesc structure", &lock_class_sx },
> >> >> >> >         { "ufs", &lock_class_lockmgr},
> >> >> >> >         { NULL, NULL },
> >> >> >> >
> >> >> >> The LOR seems to be gone, previously it showed up only once right
> >> >> >> after booting the system.
> >> >> >>
> >> >> >> But now a new LOR (according to the LOR page) seems to pop up:
> >> >> >> Trying to mount root from ufs:/dev/ad0s1a
> >> >> >> lock order reversal:
> >> >> >>  1st 0xffffff0002a4ad80 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_vfsops.c:1465
> >> >> >>  2nd 0xffffff0002b29a48 filedesc structure (filedesc structure) @
> >> >> >> /usr/src/sys/kern/kern_descrip.c:2478
> >> >> >> KDB: stack backtrace:
> >> >> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> >> >> >> _witness_debugger() at _witness_debugger+0x49
> >> >> >> witness_checkorder() at witness_checkorder+0x7ea
> >> >> >> _sx_xlock() at _sx_xlock+0x44
> >> >> >> mountcheckdirs() at mountcheckdirs+0x80
> >> >> >> vfs_donmount() at vfs_donmount+0xfbf
> >> >> >> kernel_mount() at kernel_mount+0xa1
> >> >> >> vfs_mountroot_try() at vfs_mountroot_try+0x177
> >> >> >> vfs_mountroot() at vfs_mountroot+0x47d
> >> >> >> start_init() at start_init+0x62
> >> >> >> fork_exit() at fork_exit+0x12a
> >> >> >> fork_trampoline() at fork_trampoline+0xe
> >> >> >> --- trap 0, rip = 0, rsp = 0xffffff800001ad30, rbp = 0 ---
> >> >> >>
> >> >> >> The output of `df' and `mount' looks ok.
> >> >> >
> >> >> > Yes, this is the "real" LOR as "filedesc" -> "ufs" in the poll() case
> >> >> > should be the normal order.  I believe this should fix it.
> >> >> > mountcheckdirs() doesn't need the vnodes locked, it just needs the
> >> >> > caller to hold references on them so they aren't recycled:
> >> >> >
> >> >> > --- //depot/projects/smpng/sys/kern/vfs_mount.c#96
> >> >> > +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
> >> >> > @@ -1069,9 +1069,10 @@
> >> >> >                 vfs_event_signal(NULL, VQ_MOUNT, 0);
> >> >> >                 if (VFS_ROOT(mp, LK_EXCLUSIVE, &newdp))
> >> >> >                         panic("mount: lost mount");
> >> >> > +               VOP_UNLOCK(newdp, 0);
> >> >> > +               VOP_UNLOCK(vp, 0);
> >> >> >                 mountcheckdirs(vp, newdp);
> >> >> > -               vput(newdp);
> >> >> > -               VOP_UNLOCK(vp, 0);
> >> >> > +               vrele(newdp);
> >> >> >                 if ((mp->mnt_flag & MNT_RDONLY) == 0)
> >> >> >                         error = vfs_allocate_syncvnode(mp);
> >> >> >                 vfs_unbusy(mp);
> >> >> >
> >> >> The LOR is still present, but at a different place without the
> >> >> mountcheckdirs() call (not on the LOR page either) :
> >> >
> >> > Ok, try this patch as well:
> >> >
> >> > --- //depot/projects/smpng/sys/kern/vfs_mount.c#97
> >> > +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
> >> > @@ -1481,6 +1481,8 @@
> >> >         if (VFS_ROOT(TAILQ_FIRST(&mountlist), LK_EXCLUSIVE, &rootvnode))
> >> >                 panic("Cannot find root vnode");
> >> >
> >> > +       VOP_UNLOCK(rootvnode, 0);
> >> > +
> >> >         p = curthread->td_proc;
> >> >         FILEDESC_XLOCK(p->p_fd);
> >> >
> >> > @@ -1496,8 +1498,6 @@
> >> >
> >> >         FILEDESC_XUNLOCK(p->p_fd);
> >> >
> >> > -       VOP_UNLOCK(rootvnode, 0);
> >> > -
> >> >         EVENTHANDLER_INVOKE(mountroot);
> >> >  }
> >> >
> >>
> >> Still no luck, I now get a LOR that is similar to LOR 281 right after
> >> booting:
> >>
> >> lock order reversal:
> >>  1st 0xffffff0002c2c7f8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2083
> >>  2nd 0xffffff0002b2a248 filedesc structure (filedesc structure) @
> >> /usr/src/sys/kern/vfs_syscalls.c:3776
> >> KDB: stack backtrace:
> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> >> _witness_debugger() at _witness_debugger+0x49
> >> witness_checkorder() at witness_checkorder+0x7ea
> >> _sx_slock() at _sx_slock+0x44
> >> kern_mkdirat() at kern_mkdirat+0x201
> >> syscall() at syscall+0x1af
> >> Xfast_syscall() at Xfast_syscall+0xe1
> >> --- syscall (136, FreeBSD ELF64, mkdir), rip = 0x800729dac, rsp =
> >> 0x7fffffffec88, rbp = 0x7fffffffef66 ---
> >
> > Remove the FILEDESC_SLOCK()/FILEDESC_SUNLOCK() calls from kern_mkdirat().
> >
> I removed the two lines at sys/kern/vfs_syscalls.c (3776 and 3778),
> but there still seem to be some LORs
> (attached). The two LORs about the reboot call are from before Kostik's
> patch.

Kostik has already suggested a fix for the second one.  The first one is a bit
harder to fix. :-/  The third one is a false LOR that you can ignore.
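
For anyone following the thread without a WITNESS kernel at hand, the
ordering problem can be pictured with a small user-space model. The sketch
below is Python and is only a toy: it is not how witness(4) is implemented,
and the class OrderChecker and its methods are names invented for this
example. It remembers every "A held while taking B" pair it has seen and
complains when the opposite pair shows up, which is roughly the situation
in the backtraces above ("ufs" held while "filedesc structure" is taken,
after the poll() path established filedesc first).

```python
# Toy model of a lock-order checker, loosely inspired by the reports that
# witness(4) prints. NOT the kernel implementation; all names here are
# hypothetical and exist only for this illustration.

class OrderChecker:
    def __init__(self):
        self.seen = set()   # (earlier, later) lock-name pairs observed so far
        self.held = []      # names of locks currently held, oldest first

    def acquire(self, name):
        """Record an acquisition and return a list of reversal messages."""
        reversals = []
        for earlier in self.held:
            # A reversal: we previously saw `name` taken before `earlier`.
            if (name, earlier) in self.seen:
                reversals.append(
                    f"lock order reversal: 1st {earlier}, 2nd {name}")
            self.seen.add((earlier, name))
        self.held.append(name)
        return reversals

    def release(self, name):
        self.held.remove(name)


checker = OrderChecker()

# poll()-like path: filedesc lock first, then the ufs vnode lock.
checker.acquire("filedesc structure")
checker.acquire("ufs")
checker.release("ufs")
checker.release("filedesc structure")

# mount-like path: the vnode is still locked when filedesc is taken.
checker.acquire("ufs")
reports = checker.acquire("filedesc structure")
print(reports)
checker.release("filedesc structure")
checker.release("ufs")

# Shape of the fixes in this thread: drop the vnode lock first.
checker.acquire("ufs")
checker.release("ufs")
fixed_reports = checker.acquire("filedesc structure")
print(fixed_reports)
```

The last part mirrors the pattern of the patches quoted above, where
VOP_UNLOCK() is moved ahead of the call that takes the filedesc lock, so
the two locks are never held in the reversed order.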
-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 14:24:46 2009
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D25E51065670; Thu, 30 Jul 2009 14:24:46 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 907B08FC13; Thu, 30 Jul 2009 14:24:45 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA29938; Thu, 30 Jul 2009 17:24:42 +0300 (EEST) (envelope-from avg@freebsd.org)
Message-ID: <4A71AD29.10705@freebsd.org>
Date: Thu, 30 Jul 2009 17:24:41 +0300
From: Andriy Gapon
User-Agent: Thunderbird 2.0.0.22 (X11/20090724)
MIME-Version: 1.0
To: Thomas Backman
References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org>
In-Reply-To: <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org>
X-Enigmail-Version: 0.95.7
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek
Subject: Re: zfs:
Fatal trap 12: page fault while in kernel mode
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 30 Jul 2009 14:24:47 -0000

Could you please add DEBUG_VFS_LOCKS to your kernel config and check that we
haven't broken VFS locking with the patch?
Thank you again!

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 14:38:29 2009
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6D9AE1065675; Thu, 30 Jul 2009 14:38:29 +0000 (UTC) (envelope-from r.c.ladan@gmail.com)
Received: from mail-ew0-f206.google.com (mail-ew0-f206.google.com [209.85.219.206]) by mx1.freebsd.org (Postfix) with ESMTP id 9CFC28FC08; Thu, 30 Jul 2009 14:38:28 +0000 (UTC) (envelope-from r.c.ladan@gmail.com)
Received: by ewy2 with SMTP id 2so757908ewy.43 for ; Thu, 30 Jul 2009 07:38:27 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=diD0+okxfs8DRB2iFAcBgcW3je+CtAjOl42/a2pMR1k=; b=wx97tIKESJEZkh24yqnbhgoR2xLK8lI2c5lx1gVR4f4D74GK8rw3zOwwCb2t3mAkH/ Vq+GrY9PG8nulb/HoORSYDsUIxKfi8f/uYMZKgz6QjL7SMsEuu6iFjUQcmU0v5JBW5wJ MjtoGbTXbocYqD3wdeJ4EgA13MJBk5V2n2Aug=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=G2iB4uRHQ8zeDm5FVF6ovk8QIG82XZmlEWc+rDjTPa86eCF1V46+GBg6nl2M0/xKqP qoKkojuKy2O8qgJnfSVVD0t0bfJ+N4yYNQvQnLHyyYFW+l3XPdeT8EUM4r7ipcIJ17oA 3z+tg4YaTdlnycWpo9k41RQhmmQLAbPrmtnBI=
MIME-Version: 1.0
Sender: r.c.ladan@gmail.com
Received: by 10.216.87.12 with SMTP id
x12mr240488wee.48.1248964706157; Thu, 30 Jul 2009 07:38:26 -0700 (PDT)
In-Reply-To: <20090730132121.GH1884@deviant.kiev.zoral.com.ua>
References: <200907271400.n6RE05Rv056472@freefall.freebsd.org> <200907290742.20838.jhb@freebsd.org> <200907291135.17569.jhb@freebsd.org> <20090730092507.GF1884@deviant.kiev.zoral.com.ua> <20090730132121.GH1884@deviant.kiev.zoral.com.ua>
Date: Thu, 30 Jul 2009 16:38:25 +0200
X-Google-Sender-Auth: 11df5f33dcea447f
Message-ID:
From: Rene Ladan
To: Kostik Belousov
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-fs@freebsd.org
Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 30 Jul 2009 14:38:29 -0000

2009/7/30 Kostik Belousov :
> On Thu, Jul 30, 2009 at 02:55:48PM +0200, Rene Ladan wrote:
>>
>> FreeBSD 8.0-BETA2 #3: Thu Jul 30 13:29:46 CEST 2009
>>
>> lock order reversal:
>>  1st 0xffffff00510a5d80 ufs (ufs) @ /usr/src/sys/kern/kern_exec.c:570
>>  2nd 0xffffff0002dfe248 filedesc structure (filedesc structure) @ /usr/src/sys/kern/kern_descrip.c:1864
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>> _witness_debugger() at _witness_debugger+0x49
>> witness_checkorder() at witness_checkorder+0x7ea
>> _sx_xlock() at _sx_xlock+0x44
>> setugidsafety() at setugidsafety+0x40
>> kern_execve() at kern_execve+0xf22
>> execve() at execve+0x38
>> syscall() at syscall+0x1af
>> Xfast_syscall() at Xfast_syscall+0xe1
>> --- syscall (59, FreeBSD ELF64, execve), rip = 0x8007c3d0c, rsp = 0x7fffffffec48, rbp = 0x7fffffffed50 ---
>
> For this one, please swap the order of lines 676 and 677 in
> sys/kern/kern_exec.c, that is, make it be
>                 VOP_UNLOCK(imgp->vp, 0);
>                 setugidsafety(td);
> instead of
>                 setugidsafety(td);
>                 VOP_UNLOCK(imgp->vp, 0);
>
This patch seems to solve the LORs I got right after boot (only one
about wpi is left which I already reported).

René

From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 14:39:52 2009
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2F5A9106566B; Thu, 30 Jul 2009 14:39:52 +0000 (UTC) (envelope-from serenity@exscape.org)
Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id A4AB48FC16; Thu, 30 Jul 2009 14:39:51 +0000 (UTC) (envelope-from serenity@exscape.org)
Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:38977 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWWnB-0003BT-54; Thu, 30 Jul 2009 16:39:39 +0200
Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id CAD001734E5; Thu, 30 Jul 2009 16:39:38 +0200 (CEST)
Message-Id: <7544AED1-1216-4A24-B287-F54117641F76@exscape.org>
From: Thomas Backman
To: Andriy Gapon
In-Reply-To: <4A71AD29.10705@freebsd.org>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v935.3)
Date: Thu, 30 Jul 2009 16:39:36 +0200
References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua>
<5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> <4A71AD29.10705@freebsd.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MWWnB-0003BT-54. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWWnB-0003BT-54 4ab95fe6bf6bdad971b99696663e497c Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 14:39:52 -0000 On Jul 30, 2009, at 16:24, Andriy Gapon wrote: > > Could you please add DEBUG_VFS_LOCKS to kernel config and check that > we haven't > broke VFS locking with the patch? > Thank you again! > > -- > Andriy Gapon Hey, thank *you* :) Currently recompiling the kernel, I'll have a look later. What do I do, though? Just keep an eye on the console, or something more involved? (Or, since the handbook mentions lockedvnods in ddb: when should I check lockedvnods?) BTW: Could you (or anyone else with knowledge in these areas) have a look at the libzfs_sendrecv patch? It's the final piece of the puzzle as far as all the panics (well, a core dump in this case) I've run into are concerned. http://lists.freebsd.org/pipermail/freebsd-current/2009-May/006814.html Or, in patch form (I think the indentation screws the patch up as linked there): http://exscape.org/temp/libzfs_sendrecv.patch Appears to be a pretty simple patch.
I've tried writing a test case, but it's a bit of work to make it create separate pools, etc, so I'd rather skip that if possible. Without the patch, I can't get send -R -I (recursive + auto-incremental, i.e. you can do -I snap1 tank@snap4 instead of -i snap1 -i snap2 ...) to work without core dumping on the recv (sending to a file works just fine, but when receiving from the file, it core dumps; of course, the same is true for a pipe). Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 14:46:21 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E67751065670; Thu, 30 Jul 2009 14:46:21 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id B4A4F8FC08; Thu, 30 Jul 2009 14:46:20 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA00523; Thu, 30 Jul 2009 17:46:18 +0300 (EEST) (envelope-from avg@freebsd.org) Message-ID: <4A71B239.8060007@freebsd.org> Date: Thu, 30 Jul 2009 17:46:17 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090724) MIME-Version: 1.0 To: Thomas Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org>
<86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> <4A71AD29.10705@freebsd.org> <7544AED1-1216-4A24-B287-F54117641F76@exscape.org> In-Reply-To: <7544AED1-1216-4A24-B287-F54117641F76@exscape.org> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 14:46:22 -0000 on 30/07/2009 17:39 Thomas Backman said the following: > On Jul 30, 2009, at 16:24, Andriy Gapon wrote: >> >> Could you please add DEBUG_VFS_LOCKS to kernel config and check that >> we haven't >> broke VFS locking with the patch? >> Thank you again! >> >> -- >> Andriy Gapon > Hey, thank *you* :) > Currently recompiling the kernel, I'll have a look later. What do I do, > though? Just keep an eye on the console, or something more involved? > (Or, since the handbook mentions lockedvnods in ddb: when should I check > lockedvnods?) I think you should get a panic if anything goes wrong. 
And I think you would get one :-) Next thing to try after that is the updated patch from Pawel: http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 14:49:02 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A4D631065687; Thu, 30 Jul 2009 14:49:02 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 738628FC2C; Thu, 30 Jul 2009 14:49:01 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA00539; Thu, 30 Jul 2009 17:48:59 +0300 (EEST) (envelope-from avg@freebsd.org) Message-ID: <4A71B2DA.9060902@freebsd.org> Date: Thu, 30 Jul 2009 17:48:58 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090724) MIME-Version: 1.0 To: Thomas Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> <4A71AD29.10705@freebsd.org> <7544AED1-1216-4A24-B287-F54117641F76@exscape.org> In-Reply-To: 
<7544AED1-1216-4A24-B287-F54117641F76@exscape.org> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 14:49:03 -0000 on 30/07/2009 17:39 Thomas Backman said the following: > Or, in patch form (I think the intendation screws the patch up as linked > there): > http://exscape.org/temp/libzfs_sendrecv.patch One comment on the patch - I personally don't like bit-wise xor in a logical expression. But if otherwise the expression would be huge and ugly, then OK. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 15:28:33 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 840251065690; Thu, 30 Jul 2009 15:28:33 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 004508FC15; Thu, 30 Jul 2009 15:28:32 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:42470 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWXVY-0001F4-68; Thu, 30 Jul 2009 17:25:30 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 5061E17B849; Thu, 30 Jul 2009 17:25:30 +0200 (CEST) Message-Id: <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exscape.org> From: Thomas Backman To: Andriy Gapon In-Reply-To: 
<4A71B239.8060007@freebsd.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Thu, 30 Jul 2009 17:25:27 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> <4A71AD29.10705@freebsd.org> <7544AED1-1216-4A24-B287-F54117641F76@exscape.org> <4 A71B239.8060007@freebsd.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MWXVY-0001F4-68. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWXVY-0001F4-68 7917fb9efe12468d23bab37bb794f521 Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 15:28:34 -0000 On Jul 30, 2009, at 16:46, Andriy Gapon wrote: > on 30/07/2009 17:39 Thomas Backman said the following: >> On Jul 30, 2009, at 16:24, Andriy Gapon wrote: >>> >>> Could you please add DEBUG_VFS_LOCKS to kernel config and check that >>> we haven't >>> broke VFS locking with the patch? 
>>> Thank you again! >>> >>> -- >>> Andriy Gapon >> Hey, thank *you* :) >> Currently recompiling the kernel, I'll have a look later. What do I >> do, >> though? Just keep an eye on the console, or something more involved? >> (Or, since the handbook mentions lockedvnods in ddb: when should I >> check >> lockedvnods?) > > I think you should get a panic if anything goes wrong. > And I think you would get one :-) > > Next thing to try after that is the updated patch from Pawel: > http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch Well, damnit! KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a vfs_badlock() at vfs_badlock+0x95 VOP_RECLAIM_APV() at VOP_RECLAIM_APV+0x4a vgonel() at vgonel+0x14d vrecycle() at vrecycle+0x8b zfs_freebsd_inactive() at zfs_freebsd_inactive+0x1a VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0x6c vinactive() at vinactive+0x85 vput() at vput+0x1d8 dounmount() at dounmount+0x4af unmount() at unmount+0x3c8 syscall() at syscall+0x28f Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (22, FreeBSD ELF64, unmount), rip = 0x80104e9ec, rsp = 0x7fffffffaa98, rbp = 0x801223300 --- VOP_RECLAIM: 0xffffff007b0ff1d8 interlock is locked but should not be KDB: enter: lock violation panic: from debugger #9 0xffffffff8057eda7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224 #10 0xffffffff8036c8ad in kdb_enter (why=0xffffffff80613fd5 "vfslock", msg=0xa
) at cpufunc.h:63 #11 0xffffffff805c82fa in VOP_RECLAIM_APV (vop=0xffffffff80b8b220, a=0xffffff803ea09930) at vnode_if.c:1923 #12 0xffffffff803cbd5d in vgonel (vp=0xffffff007b0ff1d8) at vnode_if.h:830 #13 0xffffffff803cc10b in vrecycle (vp=0xffffff007b0ff1d8, td=Variable "td" is not available. ) at /usr/src/sys/kern/vfs_subr.c:2504 #14 0xffffffff80b13a9a in zfs_freebsd_inactive () from /boot/kernel/zfs.ko #15 0xffffffff805c842c in VOP_INACTIVE_APV (vop=0xffffffff80b8b220, a=0xffffff803ea099f0) at vnode_if.c:1863 #16 0xffffffff803cb435 in vinactive (vp=0xffffff007b0ff1d8, td=0xffffff007b2cd720) at vnode_if.h:807 #17 0xffffffff803cc788 in vput (vp=0xffffff007b0ff1d8) at /usr/src/sys/kern/vfs_subr.c:2257 #18 0xffffffff803c5a4f in dounmount (mp=0xffffff0002c438d0, flags=0, td=Variable "td" is not available. ) at /usr/src/sys/kern/vfs_mount.c:1333 #19 0xffffffff803c6058 in unmount (td=0xffffff007b2cd720, uap=0xffffff803ea09bf0) at /usr/src/sys/kern/vfs_mount.c:1174 #20 0xffffffff80598e7f in syscall (frame=0xffffff803ea09c80) at /usr/src/sys/amd64/amd64/trap.c:984 #21 0xffffffff8057f081 in Xfast_syscall () Happened at (or very close to!) the place of the original panic. :/ Regards, Thomas PS. I'll test Pawel's patch sometime after dinner.
;) From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 15:40:12 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DAA311065670; Thu, 30 Jul 2009 15:40:12 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id A7ECC8FC1A; Thu, 30 Jul 2009 15:40:11 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id SAA01707; Thu, 30 Jul 2009 18:40:09 +0300 (EEST) (envelope-from avg@freebsd.org) Message-ID: <4A71BED8.7050300@freebsd.org> Date: Thu, 30 Jul 2009 18:40:08 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090724) MIME-Version: 1.0 To: Thomas Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> <4A71AD29.10705@freebsd.org> <7544AED1-1216-4A24-B287-F54117641F76@exscape.org> <4 A71B239.8060007@freebsd.org> <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exsca! 
pe.org> In-Reply-To: <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exscape.org> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 15:40:13 -0000 on 30/07/2009 18:25 Thomas Backman said the following: > PS. I'll test Pawel's patch sometime after dinner. ;) I believe that you should get a perfect result with it. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 16:41:44 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 04B3F1065673; Thu, 30 Jul 2009 16:41:44 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 757028FC0C; Thu, 30 Jul 2009 16:41:43 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:53925 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWYhC-0002Io-63; Thu, 30 Jul 2009 18:41:41 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 95CAB1734FD; Thu, 30 Jul 2009 18:41:32 +0200 (CEST) Message-Id: From: Thomas Backman To: Andriy Gapon In-Reply-To: <4A71BED8.7050300@freebsd.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Thu, 30 Jul 2009 18:41:29 +0200 
References: <20090727072503.GA52309@jpru.ffm.jpru.de> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> <4A71AD29.10705@freebsd.org> <7544AED1-1216-4A24-B287-F54117641F76@exscape.org> <4 A71B239.8060007@freebsd.org> <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exsca! pe.org> <4A71BED8.7050300@freebsd.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MWYhC-0002Io-63. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWYhC-0002Io-63 3faae42dab5224f65c9285e20f9272a3 Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 16:41:44 -0000 On Jul 30, 2009, at 17:40, Andriy Gapon wrote: > on 30/07/2009 18:25 Thomas Backman said the following: >> PS. I'll test Pawel's patch sometime after dinner. ;) > > I believe that you should get a perfect result with it. > > -- > Andriy Gapon If I dare say it, you were right! I've been testing for about half an hour or so (probably a bit more) now. Still using DEBUG_VFS_LOCKS, and I've tried the test case several times, ran an initial backup (i.e. 
destroy target pool and send|recv the entire pool) and a few incrementals. Rebooted, tried it again. No panic, no problems! :) Let's hope it stays this way. So, in short: With that patch (copied here just in case: http://exscape.org/temp/zfs_vnops.working.patch ) and the libzfs patch linked previously, it appears zfs send/recv works plain fine. I have yet to try it with clone/promote and stuff, but since that gave the same panic that this solved, I'm hoping there will be no problems with that anymore. Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 18:29:38 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 66949106568A; Thu, 30 Jul 2009 18:29:38 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id D889D8FC0C; Thu, 30 Jul 2009 18:29:37 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:60383 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWaNc-0003pR-47; Thu, 30 Jul 2009 20:29:30 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 4138D17DFF2; Thu, 30 Jul 2009 20:29:30 +0200 (CEST) Message-Id: From: Thomas Backman To: Andriy Gapon In-Reply-To: Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Thu, 30 Jul 2009 20:29:27 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> 
<52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> <4A71AD29.10705@freebsd.org> <7544AED1-1216-4A24-B287-F54117641F76@exscape.org> <4 A71B239.8060007@freebsd.org> <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exsca! pe.org> <4A71BED8.7050300@freebsd.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MWaNc-0003pR-47. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWaNc-0003pR-47 102eee8d767124b80d8ac397f79c86ff Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 18:29:38 -0000 On Jul 30, 2009, at 18:41, Thomas Backman wrote: > On Jul 30, 2009, at 17:40, Andriy Gapon wrote: >> on 30/07/2009 18:25 Thomas Backman said the following: >>> PS. I'll test Pawel's patch sometime after dinner. ;) >> >> I believe that you should get a perfect result with it. >> >> -- Andriy Gapon > If I dare say it, you were right! I've been testing for about half > an hour or so (probably a bit more) now. > Still using DEBUG_VFS_LOCKS, and I've tried the test case several > times, ran an initial backup (i.e. destroy target pool and send|recv > the entire pool) and a few incrementals. Rebooted, tried it again. > No panic, no problems! :) > Let's hope it stays this way. 
> > So, in short: With that patch (copied here just in case: http://exscape.org/temp/zfs_vnops.working.patch > ) and the libzfs patch linked previously, it appears zfs send/recv > works plain fine. I have yet to try it with clone/promote and stuff, > but since that gave the same panic that this solved, I'm hoping > there will be no problems with that anymore. Arrrgh! I guess I spoke too soon after all... new panic yet again. :( *sigh* It feels as if this will never become stable right now. (Maybe that's because I've spent all day and most of yesterday too on this ;) Steps and panic info: (Prior to this, I tried a simple zfs promote on one of my clones, and then reverted it by promoting the other FS again, with no problems on running the backup script.) [root@chaos ~]# zfs destroy -r tank/testfs [root@chaos ~]# bash backup.sh backup (all output is from zfs, on zfs send -R -I old tank@new | zfs recv -Fvd slave) attempting destroy slave/testfs@backup-20090730-2009 success attempting destroy slave/testfs@backup-20090730-1823 success attempting destroy slave/testfs@backup-20090730-1801 success attempting destroy slave/testfs@backup-20090730-2011 success attempting destroy slave/testfs@backup-20090730-1827 success attempting destroy slave/testfs success receiving incremental stream of tank@backup-20090730-2012 into slave@backup-20090730-2012 received 312B stream in 1 seconds (312B/sec) receiving incremental stream of tank/tmp@backup-20090730-2012 into slave/tmp@backup-20090730-2012 received 312B stream in 1 seconds (312B/sec) receiving incremental stream of tank/var@backup-20090730-2012 into slave/var@backup-20090730-2012 received 32.6KB stream in 1 seconds (32.6KB/sec) receiving incremental stream of tank/var/log@backup-20090730-2012 into slave/var/log@backup-20090730-2012 received 298KB stream in 1 seconds (298KB/sec) receiving incremental stream of tank/var/crash@backup-20090730-2012 into slave/var/crash@backup-20090730-2012 received 312B stream in 1 seconds
(312B/sec) receiving incremental stream of tank/root@backup-20090730-2012 into slave/root@backup-20090730-2012 [... panic here ...] Unread portion of the kernel message buffer: panic: solaris assert: ((zp)->z_vnode)->v_usecount > 0, file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c, line: 920 cpuid = 0 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a panic() at panic+0x182 zfsvfs_teardown() at zfsvfs_teardown+0x24d zfs_suspend_fs() at zfs_suspend_fs+0x2b zfs_ioc_recv() at zfs_ioc_recv+0x28b zfsdev_ioctl() at zfsdev_ioctl+0x8a devfs_ioctl_f() at devfs_ioctl_f+0x77 kern_ioctl() at kern_ioctl+0xf6 ioctl() at ioctl+0xfd syscall() at syscall+0x28f Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe5f7c, rsp = 0x7fffffff8ef8, rbp = 0x7fffffff9c30 --- KDB: enter: panic panic: from debugger #9 0xffffffff8057eda7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224 #10 0xffffffff8036c8ad in kdb_enter (why=0xffffffff80609c44 "panic", msg=0xa
) at cpufunc.h:63 #11 0xffffffff8033abcb in panic (fmt=Variable "fmt" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:558 #12 0xffffffff80b0ec5d in zfsvfs_teardown () from /boot/kernel/zfs.ko #13 0x0000000000100000 in ?? () #14 0xffffff001bff0250 in ?? () #15 0xffffff001bff0000 in ?? () #16 0xffffff0008004000 in ?? () #17 0xffffff803e9747a0 in ?? () #18 0xffffff803e9747d0 in ?? () #19 0xffffff803e974770 in ?? () #20 0xffffff803e974740 in ?? () #21 0xffffffff80b0ecab in zfs_suspend_fs () from /boot/kernel/zfs.ko Previous frame inner to this frame (corrupt stack?) Unfortunately, I'm not sure I can reproduce this reliably, since it worked a bunch of times both before and after my previous mail. Oh, and I'm still using -DDEBUG=1 and DEBUG_VFS_LOCKS... If this isn't a new panic because of the changes, perhaps it was triggered now and never before because of the -DDEBUG? Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Thu Jul 30 18:42:19 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CD5FA106566C; Thu, 30 Jul 2009 18:42:19 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 98D5F8FC16; Thu, 30 Jul 2009 18:42:18 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id VAA05367; Thu, 30 Jul 2009 21:42:16 +0300 (EEST) (envelope-from avg@freebsd.org) Received: from localhost.topspin.kiev.ua ([127.0.0.1] helo=edge.pp.kiev.ua) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1MWaZz-000368-Vz; Thu, 30 Jul 2009 21:42:16 +0300 Message-ID: <4A71E986.9010800@freebsd.org> Date: Thu, 30 Jul 2009 21:42:14 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.22 (X11/20090723) MIME-Version: 1.0 To: Thomas
Backman References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> <4A71AD29.10705@freebsd.org> <7544AED1-1216-4A24-B287-F54117641F76@exscape.org> <4 A71B239.8060007@freebsd.org> <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exsca! pe.org> <4A71BED8.7050300@freebsd.org> In-Reply-To: X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2009 18:42:20 -0000 on 30/07/2009 21:29 Thomas Backman said the following: > > Unfortunately, I'm not sure I can reproduce this reliably, since it > worked a bunch of times both before and after my previous mail. > > Oh, and I'm still using -DDEBUG=1 and DEBUG_VFS_LOCKS... If this isn't a > new panic because of the changes, perhaps it was triggered now and never > before because of the -DDEBUG? Thomas, I am going on vacation, so no help from me for the next two weeks. Yes, if you get a panic in ASSERT in zfs code, then it's caught because of DEBUG=1. I can't say if you would get a different panic further on if DEBUG weren't enabled. 
Maybe Pawel can say if this ASSERT is a correct one (I've seen in the past some incorrect asserts, but not in ZFS code). -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Fri Jul 31 06:06:14 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8AEC2106566B; Fri, 31 Jul 2009 06:06:14 +0000 (UTC) (envelope-from acc@hexadecagram.org) Received: from mail.itproficiency.com (hexadecagram.org [166.70.126.65]) by mx1.freebsd.org (Postfix) with ESMTP id 296DA8FC0C; Fri, 31 Jul 2009 06:06:13 +0000 (UTC) (envelope-from acc@hexadecagram.org) Received: from localhost (unknown [127.0.0.2]) by mail.itproficiency.com (Postfix) with ESMTP id 4242DC106FD; Fri, 31 Jul 2009 00:06:13 -0600 (MDT) X-Virus-Scanned: amavisd-new at itproficiency.com Received: from mail.itproficiency.com ([127.0.0.2]) by localhost (mail.itproficiency.com [127.0.0.2]) (amavisd-new, port 10024) with LMTP id ccYsQIQ-M1uU; Fri, 31 Jul 2009 00:05:57 -0600 (MDT) Received: from ares.aegaeum.hexadecagram.org (ares.aegaeum.hexadecagram.org [192.168.133.220]) by mail.itproficiency.com (Postfix) with ESMTP id 214C5C0FF81; Fri, 31 Jul 2009 00:05:54 -0600 (MDT) Message-ID: <4A7289B9.2060907@hexadecagram.org> Date: Fri, 31 Jul 2009 00:05:45 -0600 From: Anthony Chavez Organization: hexadecagram.org User-Agent: Mozilla-Thunderbird 2.0.0.19 (X11/20090103) MIME-Version: 1.0 To: freebsd-geom@freebsd.org References: <4A62E0CE.1000508@hexadecagram.org> <20090729140436.GG1586@garage.freebsd.pl> In-Reply-To: <20090729140436.GG1586@garage.freebsd.pl> X-Enigmail-Version: 0.95.7 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigB401B05976538240EEF8BBEA" Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: Re-starting a gjournal provider X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 Jul 2009 06:06:14 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigB401B05976538240EEF8BBEA Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Thanks very much for responding, Pawel. I'm moving this discussion to freebsd-geom, which is where I probably should have posted in the first place. Lack of sleep and coffee on Sunday morning were partly to blame, I'm sure. ;-) Pawel Jakub Dawidek wrote: > On Sun, Jul 19, 2009 at 03:01:02AM -0600, Anthony Chavez wrote: >> Hello freebsd-fs, >> >> I'm trying to get gjournal working on a "removable" hard disk. I use >> the term loosely, because I'm using a very simple eSATA enclosure: an >> AMS Venus DS5 [1]. >> >> If I swap out disks, atacontrol cap ad0 seems sufficient enough to >> detect the new drive: the reported device model, serial number, firmwa= re >> revision, and CHS values change as one would expect. >> >> My interpretation of [2] section 5.3 and gjournal(8) is that the >> following sequence of commands should ensure me that all write buffers= >> have been flushed and bring the system to a point where it is safe to >> remove a disk. >> >> sync; sync; sync >> gjournal sync >> umount /dev/ad0s1.journal >> gjournal stop ad0s1.journal >=20 > You should first unmount and then call 'gjournal sync'. Thank you for clarifying that. You mention this again later on in your response, and I respond below. >> However, once they are executed, /dev/ad0s1.journal disappears and whe= n >> I swap out the disk it doesn't come back. The only way I've found to >> bring it back is atacontrol detach ata0; atacontrol attach ata0, which= >> doesn't seem like a wise thing to do if I have another device on the >> same channel. >=20 > It doesn't come back because something (ATA layer?) doesn't properly > remove ad0 provider. 
> When you remove the disk, /dev/ad0 should disappear
> and reappear once you insert it again.
>
> You can still do this trick after you insert the disk again so the GEOM
> can schedule retaste:
>
> # true > /dev/ad0

Thank you for informing me of that trick. I tried using it after
"gjournal stop" but unfortunately, nothing changed.

My terminology might have been a bit off in my initial post (gjournal is
still a bit new to me). So I will attempt to clarify a bit more.

Here is an example of a typical session (the only difference this time
being "gjournal sync" following umount as you prescribed).

% sudo atacontrol info ata0
Master: ad0 SATA revision 1.x
Slave: no device present
% ls /dev/ad0*
/dev/ad0 /dev/ad0s1 /dev/ad0s1.journal
% mount | grep ad0s1
/dev/ad0s1.journal on /mnt/ad0s1 (ufs, local, gjournal)
% (
subsh> set -o errexit
subsh> sync
subsh> sync
subsh> sync
subsh> sudo umount /dev/ad0s1.journal
subsh> gjournal sync
subsh> sudo gjournal stop ad0s1.journal
subsh> )
% ls /dev/ad0*
/dev/ad0 /dev/ad0s1
% sudo true \> /dev/ad0; ls /dev/ad0*
/dev/ad0 /dev/ad0s1
% sudo true \> /dev/ad0s1; ls /dev/ad0*
/dev/ad0 /dev/ad0s1
% sudo atacontrol detach ata0 && sudo atacontrol attach ata0
Master: ad0 SATA revision 1.x
Slave: no device present
% ls /dev/ad0*
/dev/ad0 /dev/ad0s1 /dev/ad0s1.journal

Here are the points to note.

1) When I physically remove a drive from the enclosure, /dev/ad0 does
not disappear. /dev/ad0 *always* exists until I "atacontrol detach."
Even when the device is powered off, /dev/ad0 continues to exist.

2) /dev/ad0s1.journal disappears when I "gjournal stop."
/dev/ad0s1.journal is the device that, AFAIK, will only come back after
"atacontrol detach ata0; atacontrol attach ata0".

3) When I swap drives, "atacontrol cap ad0" will produce a report for
the newly-inserted drive. If I attempt to "atacontrol info ata0" before
issuing that command, it continues to display the drive model and
firmware revision from the drive that was previously inserted.
However, "atacontrol cap" does not appear to provoke the return of
/dev/ad0s1.journal.

>> My question is, do I need to issue gjournal stop before I swap disks?
>> And if so, is there any way that I can avoid the atacontrol
>> detach/attach cycle that would need to take place before any mount is
>> attempted so that /dev/ad0s1.journal appears (if the drive inserted
>> at the time does in fact utilize gjournal; I may want to experiment with
>> having disks with either gjournal or soft updates)?

This paragraph (above) and the one that preceded it in my original
post contain the following 2 questions that remain unanswered (I've
added another which was implied previously at best).

1) Is "atacontrol detach ata0 && atacontrol attach ata0" in fact a safe
operation to perform in any circumstance?

My better judgment has me thinking that the answer to this question is
almost certainly "no." However, I am hypothesizing that it would be safe
enough if all devices on ata0 are properly unmounted first, but if I can
avoid that, I will. It feels clumsy and seems to defeat the purpose of
hot-swapping.

2) Is it *necessary* to "gjournal stop" before hot-swapping?

In such a scenario, I would opt to simply "umount; gjournal sync," swap
disks, and then "atacontrol cap ad0; mount" (or even just "mount"). It
seems quite likely, however, that all drives that undergo this treatment
would be *required* to have gjournal labels since /dev/ad0s1.journal
would never disappear (although I've yet to actually test that).

3) If the answer to question 2 is "yes," then how can I handle the case
of inserting a drive that does *not* have a gjournal label?

>> And while I'm on the subject, are the (gjournal) sync commands
>> preceding umount absolutely necessary in the case of removable media?
>
> 'gjournal sync' should follow unmount, not the other way around. And it's
> better to do it, but 'gjournal stop' should do the same.
If that is indeed the case then the article I referenced as [2], "Implementing UFS Journaling on a Desktop PC," should be updated to reflect that ordering (section 5.3 prescribes a "umount" followed by "gjournal sync"). I'm submitting a PR that addresses this. In any case, the question I was asking here is actually twofold: 1) Is it really necessary to perform 3 "sync" commands before "umount"? Line 94 of src/sbin/umount/umount.c,v 1.45.20.1 has me thinking that the answer is "no," since it calls sync() itself, albeit only once. I got the idea for executing "sync" three times from /etc/rc.suspend. 2) Is it necessary to "gjournal sync" if I'm going to "gjournal stop" anyway? (You answered this one already.) Thank you for the assistance. --=20 Anthony Chavez http://hexadecagram.org/ mailto:acc@hexadecagram.org xmpp:acc@hexadecagram.org --------------enigB401B05976538240EEF8BBEA Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAkpyicAACgkQbZTbIaRBRXFQPgCfemQsbeTMnQsFfSkLIR7VMvjM f1UAmwWwuO8tnWGM3hmXOqp5F7nbFo3c =rY6f -----END PGP SIGNATURE----- --------------enigB401B05976538240EEF8BBEA-- From owner-freebsd-fs@FreeBSD.ORG Fri Jul 31 06:49:32 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5AF47106564A; Fri, 31 Jul 2009 06:49:32 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id ADF7E8FC21; Fri, 31 Jul 2009 06:49:31 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 90F8E45CA0; Fri, 31 Jul 2009 08:49:29 
+0200 (CEST) Received: from localhost (chello087206049004.chello.pl [87.206.49.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 1B41445685; Fri, 31 Jul 2009 08:49:24 +0200 (CEST) Date: Fri, 31 Jul 2009 08:49:48 +0200 From: Pawel Jakub Dawidek To: Anthony Chavez Message-ID: <20090731064948.GG1584@garage.freebsd.pl> References: <4A62E0CE.1000508@hexadecagram.org> <20090729140436.GG1586@garage.freebsd.pl> <4A7289B9.2060907@hexadecagram.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Sw7tCqrGA+HQ0/zt" Content-Disposition: inline In-Reply-To: <4A7289B9.2060907@hexadecagram.org> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 8.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org Subject: Re: Re-starting a gjournal provider X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 Jul 2009 06:49:32 -0000 --Sw7tCqrGA+HQ0/zt Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Jul 31, 2009 at 12:05:45AM -0600, Anthony Chavez wrote: > > It doesn't come back because something (ATA layer?) doesn't properly > > remove ad0 provider. When you remove the disk, /dev/ad0 should disappear > > and reappear once you insert it again. > >=20 > > You can still do this trick after you insert the disk again so the GEOM > > can schedule retaste: > >=20 > > # true > /dev/ad0 >=20 > Thank you for informing me of that trick. 
> I tried using it after
> "gjournal stop" but unfortunately, nothing changed.

This is because it should be /dev/ad0s1 and not /dev/ad0. Try with
/dev/ad0s1.

> Here are the points to note.
>
> 1) When I physically remove a drive from the enclosure, /dev/ad0 does
> not disappear. /dev/ad0 *always* exists until I "atacontrol detach."
> Even when the device is powered off, /dev/ad0 continues to exist.

This might be one of three things:

1. Your enclosure/controller doesn't report back about the disk being
   removed.
2. Your enclosure does report back, but ATA ignores the report. This
   would be a bug in ATA.
3. Your controller doesn't support hot-swap, or it supports warm-swap,
   which means you have to detach the disk by hand before removing it.

> 2) /dev/ad0s1.journal disappears when I "gjournal stop."
> /dev/ad0s1.journal is the device that, AFAIK, will only come back after
> "atacontrol detach ata0; atacontrol attach ata0".

It should also come back after 'true > /dev/ad0s1'. What this command
does is open the provider for writing (it doesn't write anything). In
GEOM this triggers a spoil event and then, once the command completes, a
retaste event. This means that GEOM will tell gjournal to check
/dev/ad0s1 again, which allows gjournal to find its metadata and
create /dev/ad0s1.journal once again.

One more test is in order. Could you run the command below before
removing the disk and again after inserting a different disk:

	# diskinfo -v /dev/ad0

If it shows exactly the same output in both cases, the system is not
aware that the disk was replaced and a detach/attach cycle is needed.

> 1) Is "atacontrol detach ata0 && atacontrol attach ata0" in fact a safe
> operation to perform in any circumstance?
>
> My better judgment has me thinking that the answer to this question is
> almost certainly "no." However, I am hypothesizing that it would be safe
> enough if all devices on ata0 are properly unmounted first, but if I can
> avoid that, I will.
> It feels clumsy and seems to defeat the purpose of
> hot-swapping.

It should be safe, but there have been plenty of bugs related to a disk
disappearing from under a mounted file system, etc. If nothing is mounted
you should be fine (if there are no ATA bugs in this area). But for full
hot-swap the disk controller should discover that the disk was removed
and the ATA code should remove it from /dev/.

> 2) Is it *necessary* to "gjournal stop" before hot-swapping?
>
> In such a scenario, I would opt to simply "umount; gjournal sync," swap
> disks, and then "atacontrol cap ad0; mount" (or even just "mount"). It
> seems quite likely, however, that all drives that undergo this treatment
> would be *required* to have gjournal labels since /dev/ad0s1.journal
> would never disappear (although I've yet to actually test that).

I'd go with 'umount; gjournal stop' and drop 'gjournal sync'.

The controller should inform ATA that the disk is gone. ATA should inform
GEOM that ad0 is gone. If that were the case, a simple 'umount; gjournal
sync' would be enough. But because it isn't the case, you have to stop
gjournal and detach ad0.

> 3) If the answer to question 2 is "yes," then how can I handle the case
> of inserting a drive that does *not* have a gjournal label?

There's nothing special here. Let's see how the diskinfo test goes first.

> 1) Is it really necessary to perform 3 "sync" commands before "umount"?
>
> Line 94 of src/sbin/umount/umount.c,v 1.45.20.1 has me thinking that the
> answer is "no," since it calls sync() itself, albeit only once. I got
> the idea for executing "sync" three times from /etc/rc.suspend.

The idea is that unmount should take care of syncing data. There should
be no need for even one sync. It is called "just in case".

> 2) Is it necessary to "gjournal sync" if I'm going to "gjournal stop"
> anyway? (You answered this one already.)

No, stop should be sufficient.

-- 
Pawel Jakub Dawidek http://www.wheel.pl
pjd@FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil?
Yes, I Am! --Sw7tCqrGA+HQ0/zt Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFKcpQMForvXbEpPzQRAvBrAJ4l7CA4pYXbRsfVDqT4vdzeSqIYggCdHrxs FlRRqlOfTP6dD/n2dVzEBsQ= =dGOD -----END PGP SIGNATURE----- --Sw7tCqrGA+HQ0/zt-- From owner-freebsd-fs@FreeBSD.ORG Fri Jul 31 09:05:26 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BE7FF106566B; Fri, 31 Jul 2009 09:05:26 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 3699B8FC0A; Fri, 31 Jul 2009 09:05:26 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:43858 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWo2z-0007k4-4T; Fri, 31 Jul 2009 11:05:12 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 3848432BFC; Fri, 31 Jul 2009 11:05:03 +0200 (CEST) Message-Id: From: Thomas Backman To: Thomas Backman In-Reply-To: Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Fri, 31 Jul 2009 11:05:01 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> 
<6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> <4A71AD29.10705@freebsd.org> <7544AED1-1216-4A24-B287-F54117641F76@exscape.org> <4 A71B239.8060007@freebsd.org> <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exsca! pe.org> <4A71BED8.7050300@freebsd.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MWo2z-0007k4-4T. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWo2z-0007k4-4T 7f1a4434528ae9fc72d402c409330e44 Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek , Andriy Gapon Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 Jul 2009 09:05:27 -0000 On Jul 30, 2009, at 20:29, Thomas Backman wrote: > On Jul 30, 2009, at 18:41, Thomas Backman wrote: > >> On Jul 30, 2009, at 17:40, Andriy Gapon wrote: >>> on 30/07/2009 18:25 Thomas Backman said the following: >>>> PS. I'll test Pawel's patch sometime after dinner. ;) >>> >>> I believe that you should get a perfect result with it. >>> >>> -- Andriy Gapon >> If I dare say it, you were right! I've been testing for about half >> an hour or so (probably a bit more) now. >> Still using DEBUG_VFS_LOCKS, and I've tried the test case several >> times, ran an initial backup (i.e. destroy target pool and send| >> recv the entire pool) and a few incrementals. Rebooted, tried it >> again. No panic, no problems! :) >> Let's hope it stays this way. 
>> >> So, in short: With that patch (copied here just in case: http://exscape.org/temp/zfs_vnops.working.patch >> ) and the libzfs patch linked previously, it appears zfs send/recv >> works plain fine. I have yet to try it with clone/promote and >> stuff, but since that gave the same panic that this solved, I'm >> hoping there will be no problems with that anymore. > > Arrrgh! > I guess I spoke too soon after all... new panic yet again. :( > *sigh* It feels as if this will never become stable right now. > (Maybe that's because I've spent all day and most of yesterday too > on this ;) > > Steps and panic info: > > (Prior to this, I tried a simple zfs promote on one of my clones, > and then reverted it by promoting the other FS again, with no > problems on running the backup script.) > > [root@chaos ~]# zfs destroy -r tank/testfs > [root@chaos ~]# bash backup.sh backup > (all output is from zfs, on zfs send -R -I old tank@new | zfs recv - > Fvd slave) > > attempting destroy slave/testfs@backup-20090730-2009 > success > attempting destroy slave/testfs@backup-20090730-1823 > success > attempting destroy slave/testfs@backup-20090730-1801 > success > attempting destroy slave/testfs@backup-20090730-2011 > success > attempting destroy slave/testfs@backup-20090730-1827 > success > attempting destroy slave/testfs > success > receiving incremental stream of tank@backup-20090730-2012 into > slave@backup-20090730-2012 > received 312B stream in 1 seconds (312B/sec) > receiving incremental stream of tank/tmp@backup-20090730-2012 into > slave/tmp@backup-20090730-2012 > received 312B stream in 1 seconds (312B/sec) > receiving incremental stream of tank/var@backup-20090730-2012 into > slave/var@backup-20090730-2012 > received 32.6KB stream in 1 seconds (32.6KB/sec) > receiving incremental stream of tank/var/log@backup-20090730-2012 > into slave/var/log@backup-20090730-2012 > received 298KB stream in 1 seconds (298KB/sec) > receiving incremental stream of 
tank/var/crash@backup-20090730-2012 > into slave/var/crash@backup-20090730-2012 > received 312B stream in 1 seconds (312B/sec) > receiving incremental stream of tank/root@backup-20090730-2012 into > slave/root@backup-20090730-2012 > [... panic here ...] > > Unread portion of the kernel message buffer:panic: solaris assert: > ((zp)->z_vnode)->v_usecount > 0, file: /usr/src/sys/modules/ > zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c, > line: 920 > cpuid = 0 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a > panic() at panic+0x182 > zfsvfs_teardown() at zfsvfs_teardown+0x24d > zfs_suspend_fs() at zfs_suspend_fs+0x2b > zfs_ioc_recv() at zfs_ioc_recv+0x28b > zfsdev_ioctl() at zfsdev_ioctl+0x8a > devfs_ioctl_f() at devfs_ioctl_f+0x77 > kern_ioctl() at kern_ioctl+0xf6 > ioctl() at ioctl+0xfd > syscall() at syscall+0x28f > Xfast_syscall() at Xfast_syscall+0xe1 > --- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe5f7c, rsp = > 0x7fffffff8ef8, rbp = 0x7fffffff9c30 --- > KDB: enter: panic > panic: from debugger > > #9 0xffffffff8057eda7 in calltrap () > at /usr/src/sys/amd64/amd64/exception.S:224 > #10 0xffffffff8036c8ad in kdb_enter (why=0xffffffff80609c44 "panic", > msg=0xa
> ) at cpufunc.h:63
> #11 0xffffffff8033abcb in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:558
> #12 0xffffffff80b0ec5d in zfsvfs_teardown () from /boot/kernel/zfs.ko
> #13 0x0000000000100000 in ?? ()
> #14 0xffffff001bff0250 in ?? ()
> #15 0xffffff001bff0000 in ?? ()
> #16 0xffffff0008004000 in ?? ()
> #17 0xffffff803e9747a0 in ?? ()
> #18 0xffffff803e9747d0 in ?? ()
> #19 0xffffff803e974770 in ?? ()
> #20 0xffffff803e974740 in ?? ()
> #21 0xffffffff80b0ecab in zfs_suspend_fs () from /boot/kernel/zfs.ko
> Previous frame inner to this frame (corrupt stack?)
>
> Unfortunately, I'm not sure I can reproduce this reliably, since it
> worked a bunch of times both before and after my previous mail.
>
> Oh, and I'm still using -DDEBUG=1 and DEBUG_VFS_LOCKS... If this
> isn't a new panic because of the changes, perhaps it was triggered
> now and never before because of the -DDEBUG?
>
> Regards,
> Thomas

I'm able to reliably reproduce this panic by having zfs recv destroy a
filesystem on the receiving end.

1) Use -DDEBUG=1, I guess
2) Create a FS on the source pool you don't care about:
   zfs create -o mountpoint=/testfs source/testfs
3) Clone a pool to another:
   zfs snapshot -r source@snap && zfs send -R source@snap | zfs recv -Fvd target
4) zfs destroy -r source/testfs
5) zfs snapshot -r source@snap2 && zfs send -R -I snap source@snap2 | zfs recv -Fvd target
6) ^ Panic while receiving the FS the destroyed one is mounted under.

In my case, this was tank/root three times out of three; I then tried
creating testfs under /tmp (tank/tmp/testfs), *mounting* it under
/usr/testfs, and it panics on receiving tank/usr:

attempting destroy slave/tmp/testfs@backup-20090731-1100
success
attempting destroy slave/tmp/testfs@backup-20090731-1036
success
attempting destroy slave/tmp/testfs
success
...
receiving incremental stream of tank/tmp@backup-20090731-1101 into slave/tmp@backup-20090731-1101 received 312B stream in 1 seconds (312B/sec) receiving incremental stream of tank/root@backup-20090731-1101 into slave/root@backup-20090731-1101 received 58.3KB stream in 1 seconds (58.3KB/sec) receiving incremental stream of tank/usr@backup-20090731-1101 into slave/usr@backup-20090731-1101 ... panic here, no more output Same backtrace/assert as above. Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Fri Jul 31 11:45:15 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 56E101065670; Fri, 31 Jul 2009 11:45:15 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from mail.jrv.org (adsl-70-243-84-13.dsl.austtx.swbell.net [70.243.84.13]) by mx1.freebsd.org (Postfix) with ESMTP id C71FD8FC1A; Fri, 31 Jul 2009 11:45:14 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from kremvax.housenet.jrv (kremvax.housenet.jrv [192.168.3.124]) by mail.jrv.org (8.14.3/8.14.3) with ESMTP id n6VBjA1Y052822; Fri, 31 Jul 2009 06:45:10 -0500 (CDT) (envelope-from james-freebsd-fs2@jrv.org) Authentication-Results: mail.jrv.org; domainkeys=pass (testing) header.from=james-freebsd-fs2@jrv.org DomainKey-Signature: a=rsa-sha1; s=enigma; d=jrv.org; c=nofws; q=dns; h=message-id:date:from:user-agent:mime-version:to:cc:subject: references:in-reply-to:content-type:content-transfer-encoding; b=OmqSEcYPqBrAo7Dky/OQC1p1Uw0X+d7ZeoSfNvGazbfwmRMycSDR4HcWSAtddY1dF MSjjUHBGRVJBCnla0ZkODSWCRhhJrNqngsHjwpi0OlZ1TqnX7O/1SUfjAtfV5AgSDNL MabJPLXt62vspspPSVuOSRcBdCO2dzk5bEZTKOA= Message-ID: <4A72D946.4090401@jrv.org> Date: Fri, 31 Jul 2009 06:45:10 -0500 From: "James R. 
Van Artsdalen" User-Agent: Thunderbird 2.0.0.22 (Macintosh/20090605) MIME-Version: 1.0 To: Andriy Gapon References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> <4A71AD29.10705@freebsd.org> <7544AED1-1216-4A24-B287-F54117641F76@exscape.org> <4A71B2DA.9060902@freebsd.org> In-Reply-To: <4A71B2DA.9060902@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek , Thomas Backman Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 Jul 2009 11:45:15 -0000

Andriy Gapon wrote:
> on 30/07/2009 17:39 Thomas Backman said the following:
>
>> Or, in patch form (I think the indentation screws the patch up as linked
>> there):
>> http://exscape.org/temp/libzfs_sendrecv.patch
>>
>
> One comment on the patch - I personally don't like bit-wise xor in a logical
> expression. But if otherwise the expression would be huge and ugly, then OK.
>

If you're going to code an XOR, use an XOR. Don't make the reader
untangle code to figure out that some other code is really just an XOR.
However, I think I was trying to handle two cases that can't happen: the
top filesystem cannot be renamed to somewhere else in the pool, and no
other filesystem can be renamed to the root. So the new version of the
patch below needs no XOR.

Without this or something like it you can't replicate an entire pool, i.e.

zfs send -R -I @yesterday pool@today | ssh backup zfs recv -vF -d pool

dumps core from the strcmp(0, 0) in the original code below.

Index: cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
===================================================================
--- cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c (revision 192136)
+++ cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c (working copy)
@@ -1126,7 +1126,7 @@
 uint64_t originguid = 0;
 uint64_t stream_originguid = 0;
 uint64_t parent_fromsnap_guid, stream_parent_fromsnap_guid;
- char *fsname, *stream_fsname;
+ char *fsname, *stream_fsname, *p1, *p2;

 nextfselem = nvlist_next_nvpair(local_nv, fselem);

@@ -1295,10 +1295,11 @@
 "parentfromsnap", &stream_parent_fromsnap_guid));

 /* check for rename */
+ p1 = strrchr(fsname, '/');
+ p2 = strrchr(stream_fsname, '/');
 if ((stream_parent_fromsnap_guid != 0 &&
 stream_parent_fromsnap_guid != parent_fromsnap_guid) ||
- strcmp(strrchr(fsname, '/'),
- strrchr(stream_fsname, '/')) != 0) {
+ (p1 != NULL && p2 != NULL && strcmp (p1, p2) != 0)) {
 nvlist_t *parent;
 char tryname[ZFS_MAXNAMELEN];

@@ -1317,7 +1318,7 @@
 VERIFY(0 == nvlist_lookup_string(parent,
 "name", &pname));
 (void) snprintf(tryname, sizeof (tryname),
- "%s%s", pname, strrchr(stream_fsname, '/'));
+ "%s%s", pname, p2 ?
p2 : ""); } else { tryname[0] = '\0'; if (flags.verbose) { From owner-freebsd-fs@FreeBSD.ORG Fri Jul 31 12:27:51 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 82AE71065686; Fri, 31 Jul 2009 12:27:51 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 001FA8FC16; Fri, 31 Jul 2009 12:27:50 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:43667 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWrCi-0004oB-47; Fri, 31 Jul 2009 14:27:22 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 7866A3611B; Fri, 31 Jul 2009 14:27:17 +0200 (CEST) Message-Id: <76338BEC-9B85-4AD1-B04B-850486866F3B@exscape.org> From: Thomas Backman To: James R. 
Van Artsdalen In-Reply-To: <4A72D946.4090401@jrv.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Fri, 31 Jul 2009 14:27:15 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> <4A71AD29.10705@freebsd.org> <7544AED1-1216-4A24-B287-F54117641F76@exscape.org> <4A71B2DA.9060902@freebsd.org> <4A72D946.4090 401@jrv.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MWrCi-0004oB-47. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWrCi-0004oB-47 67b2a4853640d5ae6f4cd19fe5a4945e Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek , Andriy Gapon Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 Jul 2009 12:27:52 -0000 On Jul 31, 2009, at 13:45, James R. 
Van Artsdalen wrote: > Andriy Gapon wrote: >> on 30/07/2009 17:39 Thomas Backman said the following: >> >>> Or, in patch form (I think the intendation screws the patch up as >>> linked >>> there): >>> http://exscape.org/temp/libzfs_sendrecv.patch >>> >> >> One comment on the patch - I personally don't like bit-wise xor in >> a logical >> expression. But if otherwise the expression would be huge and ugly, >> then OK. >> > > If you're going to code an XOR, use an XOR. > Don' make the reader untangle code to figure out that that some other > code is really just an XOR. > > However I think I was trying to handle two cases that can't happen: > the > top filesystem cannot be renamed to somewhere else in the pool, and no > other filesystem can be renamed to the root. So the new version of > the > patch below needs no XOR. > > Without this or something like it you can't replicate an entire > pool, i.e. > > zfs send -R -I @yesterday pool@today | ssh backup zfs recv -vF -d > pool > > dumps core from the strccmp(0, 0) in the original code below. > > > Index: cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c > =================================================================== > ... Nice job, thanks :) Just wanted to chime in and say that your new patch seems to work just as well as the previous one. I hope you don't mind me hosting this too (I had to apply it manually thanks to spacing... I think it's my mail client not being very nice at retaining tabs/spaces)... Straight from svn diff: http://exscape.org/temp/libzfs_sendrecv.new.patch BTW (maybe not on topic for this mail, but for this thread), I've created a test case to reproduce the new panic (every time). It happens with -DDEBUG=1, after destroying a filesystem and then doing an incremental backup. Currently recompiling world/kernel on a second box to reproduce before I post that. 
Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Fri Jul 31 13:51:45 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B760C1065689 for ; Fri, 31 Jul 2009 13:51:45 +0000 (UTC) (envelope-from bf1783@googlemail.com) Received: from mail-fx0-f210.google.com (mail-fx0-f210.google.com [209.85.220.210]) by mx1.freebsd.org (Postfix) with ESMTP id 3BCA78FC2B for ; Fri, 31 Jul 2009 13:51:45 +0000 (UTC) (envelope-from bf1783@googlemail.com) Received: by fxm6 with SMTP id 6so157470fxm.43 for ; Fri, 31 Jul 2009 06:51:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=bs3ATdoJfJLZzaAfLz1Sbb8lczyx/rDMuuo6u5N89SQ=; b=EfLsFd3NOwR6hNsNAfA/eC059qa7ArCuJWMcQhBa2I/dHDVFByXHui/6Lin8MVhZ3A 7uCTi+9gHiETyS8xGHb3RoT5k/nDTIuhzCk0H2qr2KmPHZhqtECNF6+oj13dc5Q+ZO05 Vkm3j06Rmi4/ceocrAZ/YYaKJT+Ci4f0RD5xE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=kNoDGyjtik9DR22jJWoRRCRC9HcSLgkWw5+lPyH9aBR8VO/PJmFB6YnYxZPu1vbGsd PSqpOsyoghh0GWlOCD6w0VIkR1YFq3yituTJj4i225HtL1BhgwM/eadCJIaJkxcAUPL0 WrsTVGr9aFCouvdY14uhZVWfXxxyinlKCmEdw= MIME-Version: 1.0 Received: by 10.239.167.212 with SMTP id h20mr241375hbe.68.1249046977816; Fri, 31 Jul 2009 06:29:37 -0700 (PDT) In-Reply-To: <26ddd1750907260719x761a1c94r27c572ab1ff6a582@mail.gmail.com> References: <26ddd1750907260719x761a1c94r27c572ab1ff6a582@mail.gmail.com> Date: Fri, 31 Jul 2009 13:29:37 +0000 Message-ID: From: "b. f." 
To: freebsd-questions@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, Maxim Khitrov Subject: Re: UFS2 tuning for heterogeneous 4TB file system X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 Jul 2009 13:51:46 -0000 On 7/26/09, Maxim Khitrov wrote: > On Sun, Jul 26, 2009 at 3:56 AM, b. f. wrote: >>>The file system in question will not have a common file size (which is >>>what, as I understand, bytes per inode should be tuned for). There >>>will be many small files (< 10 KB) and many large ones (> 500 MB). A >>>similar, in terms of content, 2TB ntfs file system on another server >>>has an average file size of about 26 MB with 59,246 files. >> >> Ordinarily, it may have a large variation in file sizes, but can you >> intervene, and segregate large and small files in separate >> filesystems, so that you can optimize the settings for each >> independently? > > That's a good idea, but the problem is that this raid array will grow > in the future as I add additional drives. As far as I know, a > partition can be expanded using growfs, but it cannot be moved to a > higher address (with any "standard" tools). So if I create two > separate partitions for different file types, the first partition will > have to remain a fixed size. That would be problematic, since I cannot > easily predict how much space it would need initially and for all > future purposes (enough to store all the files, yet not waste space > that could otherwise be used for the second partition). > Perhaps gconcat(8), gmirror(8), or vinum(4) will solve your problem here. I think there are other tools as well. >>>Ideally, I would prefer that small files do not waste more than 4 KB >>>of space, which is what you have with ntfs. 
At the same time, having
>>>fsck running for days after an unclean shutdown is also not a good
>>>option (I always disable background checking). From what I've gathered
>>>so far, the two requirements are at the opposite ends in terms of file
>>>system optimization.
>>
>> I gather you are trying to be conservative, but have you considered
>> using gjournal(8)? At least for the filesystems with many small
>> files? In that way, you could safely avoid the need for most if not
>> all use of fsck(8), and, as an adjunct benefit, you would be able to
>> operate on the small files more quickly:
>>
>> http://lists.freebsd.org/pipermail/freebsd-current/2006-June/064043.html
>> http://www.freebsd.org/doc/en_US.ISO8859-1/articles/gjournal-desktop/article.html
>>
>> gjournal has a lower overhead than ZFS, and has proven to be fairly
>> reliable. Also, you can always unhook it and revert to plain UFS
>> mounts easily.
>>
>> b.
>>
>
> Just fairly reliable? :)
>

Well, I'm not going to promise the sun, the moon, and the stars. It
has worked for me (better than softupdates, I might add) under my more
modest workloads.

> I've done a bit of reading on gjournal and the main thing that's
> preventing me from using it is the recency of implementation. I've had
> a number of FreeBSD servers go down in the past due to power outages
> and SoftUpdates with foreground fsck have never failed me. I have
> never had a corrupt ufs2 partition, which is not something I can say
> about a few linux servers with ext3.
>
> Have there been any serious studies into how gjournal and SU deal with
> power outages? By that I mean taking two identical machines, issuing
> write operations, yanking the power cords, and then watching both
> systems recover? I'm sure that gjournal will take less time to reboot,
> but if this experiment is repeated a few hundred times I wonder what
> the corruption statistics would be.
Is there ever a case, for > instance, when the journal itself becomes corrupt because the power > was pulled in the middle of a metadata flush? > I'm not aware of any such tests, but I wouldn't be surprised if pjd@ or someone else who was interested in using gjournal(8) in a demanding environment had made some. I'll cc freebsd-fs@, because some of them may not monitor freebsd-questions. Perhaps someone there has some advice. You might also try asking on freebsd-geom@. Regards, b. > Basically, I have no experience with gjournal, poor experience with > other journaled file systems, and no real comparison between > reliability characteristics of gjournal and SoftUpdates, which have > served me very well in the past. > > - Max > From owner-freebsd-fs@FreeBSD.ORG Fri Jul 31 14:53:39 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 428821065673 for ; Fri, 31 Jul 2009 14:53:39 +0000 (UTC) (envelope-from jensrasmus@gmail.com) Received: from mail-bw0-f206.google.com (mail-bw0-f206.google.com [209.85.218.206]) by mx1.freebsd.org (Postfix) with ESMTP id C2DA48FC16 for ; Fri, 31 Jul 2009 14:53:38 +0000 (UTC) (envelope-from jensrasmus@gmail.com) Received: by bwz2 with SMTP id 2so1179145bwz.43 for ; Fri, 31 Jul 2009 07:53:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type; bh=s3cByMnu9Zkp8cC3tn63a2Ebn3jeMpPJGs7WAvHpgdA=; b=mwajESYFb317gE6FfNxlI07Gd7pdrf5S7JpWQNSrPxvuK33rQX1Y0rh8I59UwQS6ea 2+qhY8Lrm6GuMQm5pwt5CsScqoLuWGD7oFYksUtIoZ0Ptvh224tk6GXR28EHI4rTk7MV hfI6A0+9swYbw6zLQIeB32s3ajcti5UOhjtMI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=qn35a9nubGQ85Ph00uxvy6ywqeExEOCttbXOMqTLEyxgHiiUbGnpUotet+1wZITL+w 
ZLbip9wx2hAZj3NFJwPgxTHxz2NTZaDFm8Yi1cDvc/tN+N6RyxeTbKtoDUIvClF80sDs RPqW8+F5+umiuvHcLJKxPSSd9Yjaj3ZYss8OA= MIME-Version: 1.0 Received: by 10.204.118.134 with SMTP id v6mr2945340bkq.2.1249050340168; Fri, 31 Jul 2009 07:25:40 -0700 (PDT) Date: Fri, 31 Jul 2009 16:25:40 +0200 Message-ID: <63e02e980907310725t2b38d1d3iff66aca3948ac8dd@mail.gmail.com> From: Jens Rasmus Liland To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: How do I mount an external ntfs formatted harddisk manually and through /etc/fstab? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 Jul 2009 14:53:39 -0000 Hi, How do I mount an NTFS formatted external harddisk plugged into the computer using a usb cable? And what do i write in the /etc/fstab after being able to successfully mount it manually? I have some blurry understanding after reading a bit in handbook that the harddisk's NTFS partition is at /dev/da0s1 by default. I have installed ntfs-3g from ports. 
/Rasmus From owner-freebsd-fs@FreeBSD.ORG Fri Jul 31 17:10:37 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 26671106566C; Fri, 31 Jul 2009 17:10:37 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 9653A8FC26; Fri, 31 Jul 2009 17:10:36 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:42358 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MWvcO-0001VQ-4K; Fri, 31 Jul 2009 19:10:10 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 051CAE5B4; Fri, 31 Jul 2009 19:10:06 +0200 (CEST) Message-Id: From: Thomas Backman To: Pawel Jakub Dawidek In-Reply-To: Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Fri, 31 Jul 2009 19:10:03 +0200 References: <20090727072503.GA52309@jpru.ffm.jpru.de> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> 
<4A71AD29.10705@freebsd.org> <7544AED1-1216-4A24-B287-F54117641F76@exscape.org> <4 A71B239.8060007@freebsd.org> <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exsca! pe.org> <4A71BED8.7050300@freebsd.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MWvcO-0001VQ-4K. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MWvcO-0001VQ-4K dd068384b9ef3b915ada255a211abdc2 Cc: freebsd-fs@freebsd.org, FreeBSD current , Andriy Gapon Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 Jul 2009 17:10:37 -0000 On Jul 30, 2009, at 20:29, Thomas Backman wrote: > On Jul 30, 2009, at 18:41, Thomas Backman wrote: > >> On Jul 30, 2009, at 17:40, Andriy Gapon wrote: >>> on 30/07/2009 18:25 Thomas Backman said the following: >>>> PS. I'll test Pawel's patch sometime after dinner. ;) >>> >>> I believe that you should get a perfect result with it. >>> >>> -- Andriy Gapon >> If I dare say it, you were right! I've been testing for about half >> an hour or so (probably a bit more) now. >> Still using DEBUG_VFS_LOCKS, and I've tried the test case several >> times, ran an initial backup (i.e. destroy target pool and send| >> recv the entire pool) and a few incrementals. Rebooted, tried it >> again. No panic, no problems! :) >> Let's hope it stays this way. >> >> So, in short: With that patch (copied here just in case: http://exscape.org/temp/zfs_vnops.working.patch >> ) and the libzfs patch linked previously, it appears zfs send/recv >> works plain fine. I have yet to try it with clone/promote and >> stuff, but since that gave the same panic that this solved, I'm >> hoping there will be no problems with that anymore. > > Arrrgh! > I guess I spoke too soon after all... new panic yet again. 
:(
> *sigh* It feels as if this will never become stable right now.
> (Maybe that's because I've spent all day and most of yesterday too
> on this ;)
>
> [... same panic as I'm posting in the reply below snipped ...]
>
> Unfortunately, I'm not sure I can reproduce this reliably, since it
> worked a bunch of times both before and after my previous mail.
>
> Oh, and I'm still using -DDEBUG=1 and DEBUG_VFS_LOCKS... If this
> isn't a new panic because of the changes, perhaps it was triggered
> now and never before because of the -DDEBUG?

OK, I created a "test case" that triggers this panic for me every time, and reproduced it on another machine, so it should, uh, "work" for anyone reading this as well.

Here are my patches, and the script used to reproduce the panic:
(This assumes that you've got a clean SVN/cvsup source tree. If you have any of the patches mentioned below, remove them from the .patch first.)

http://exscape.org/temp/zfs_destroy_panic_patches.patch
(contains: James R. Van Artsdalen's libzfs_sendrecv patch that makes it not coredump(...), activating ZFS debugging (-DDEBUG=1), and Pawel's zfs_vnops.c patch.)

http://exscape.org/temp/zfs_destroy_panic.sh
(needs bash and 200MB free on your /root/-containing FS, unless you change the variables at the top; usage: "bash ...sh crash")

You'll need to rebuild zfs.ko and libzfs, and if you use zfs.ko already, of course, reboot. (The libzfs patch can be installed and used without rebooting.)

1) cd /usr/src; fetch http://exscape.org/temp/zfs_destroy_panic_patches.patch && patch < zfs_destroy_panic_patches.patch
2) cd /usr/src/cddl/lib/libzfs/ ; make && make install
3) cd /usr/src/sys/modules/zfs ; make && make install
3b) (reboot, or kldload zfs)
4) fetch http://exscape.org/temp/zfs_destroy_panic.sh && bash zfs_destroy_panic.sh crash

My output (snipped for brevity, most is useless stuff from dd, etc.):
(I prepended a >> to output written by my script; the rest is from zfs. This isn't in the script itself.)
>> Creating pools
>> Creating filesystems
>> Creating snapshot(s)
>> Doing initial clone to slave pool
receiving full stream of crashtestmaster@backup-20090731-185218 into crashtestslave@backup-20090731-185218
received 15.0KB stream in 1 seconds (15.0KB/sec)
receiving full stream of crashtestmaster/testroot@backup-20090731-185218 into crashtestslave/testroot@backup-20090731-185218
received 15.0KB stream in 1 seconds (15.0KB/sec)
receiving full stream of crashtestmaster/testroot/testfs@backup-20090731-185218 into crashtestslave/testroot/testfs@backup-20090731-185218
received 1.02MB stream in 1 seconds (1.02MB/sec)
>> Initial step done!
>> Destroying testfs
>> Taking snapshots
>> Starting backup...
sending from @backup-20090731-185218 to crashtestmaster@backup-20090731-185226-11214-7776
sending from @backup-20090731-185218 to crashtestmaster/testroot@backup-20090731-185226-11214-7776
attempting destroy crashtestslave/testroot/testfs@backup-20090731-185218
success
attempting destroy crashtestslave/testroot/testfs
success
receiving incremental stream of crashtestmaster@backup-20090731-185226-11214-7776 into crashtestslave@backup-20090731-185226-11214-7776
received 312B stream in 1 seconds (312B/sec)
receiving incremental stream of crashtestmaster/testroot@backup-20090731-185226-11214-7776 into crashtestslave/testroot@backup-20090731-185226-11214-7776
[... panic, no more output ...]
DDB info, etc (from the original box; not the same run as above, but the same panic, so...):

Unread portion of the kernel message buffer:
panic: solaris assert: ((zp)->z_vnode)->v_usecount > 0, file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c, line: 920
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
panic() at panic+0x182
zfsvfs_teardown() at zfsvfs_teardown+0x24d
zfs_suspend_fs() at zfs_suspend_fs+0x2b
zfs_ioc_recv() at zfs_ioc_recv+0x28b
zfsdev_ioctl() at zfsdev_ioctl+0x8a
devfs_ioctl_f() at devfs_ioctl_f+0x77
kern_ioctl() at kern_ioctl+0xf6
ioctl() at ioctl+0xfd
syscall() at syscall+0x28f
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe5f7c, rsp = 0x7fffffff8ee8, rbp = 0x7fffffff9c20 ---
KDB: enter: panic
panic: from debugger
cpuid = 0
Uptime: 25m47s
Physical memory: 2030 MB
Dumping 1663 MB:
...
#11 0xffffffff8033abcb in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:558
#12 0xffffffff80b0ec5d in zfsvfs_teardown () from /boot/kernel/zfs.ko
#13 0x0000000000100000 in ?? ()
#14 0xffffff0048a7e250 in ?? ()
#15 0xffffff0048a7e000 in ?? ()
#16 0xffffff00063c0000 in ?? ()
#17 0xffffff803e8f27a0 in ?? ()
#18 0xffffff803e8f27d0 in ?? ()
#19 0xffffff803e8f2770 in ?? ()
#20 0xffffff803e8f2740 in ?? ()
#21 0xffffffff80b0ecab in zfs_suspend_fs () from /boot/kernel/zfs.ko
Previous frame inner to this frame (corrupt stack?)

I commented out -DDEBUG=1 and rebuilt+installed just the zfs module, and the panic appears to be gone. With DEBUG, it panicked every time (and I tried it at least 4-5 times). Without, it has worked flawlessly three times in a row, as has my regular backup.

So, the big, TL;DR question is: is the ASSERT() unnecessary, as Andriy proposed it *might* be, or is this a real issue that actually needs fixing? It doesn't feel right to just ignore a potential bug by ignoring a failed assertion...
Pawel?
Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Sat Aug 1 03:18:30 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 846CD1065670 for ; Sat, 1 Aug 2009 03:18:30 +0000 (UTC) (envelope-from ludwigp@chip-web.com) Received: from toy2.chip-web.com (adsl-63-195-43-50.dsl.snfc21.pacbell.net [63.195.43.50]) by mx1.freebsd.org (Postfix) with SMTP id CAEC68FC08 for ; Sat, 1 Aug 2009 03:18:29 +0000 (UTC) (envelope-from ludwigp@chip-web.com) Received: (qmail 85207 invoked from network); 1 Aug 2009 01:53:56 -0000 Received: from localhost.chip-web.com (HELO ?127.0.0.1?) (ludwigp@127.0.0.1) by localhost.chip-web.com with SMTP; 1 Aug 2009 01:53:56 -0000 Message-ID: <4A73A096.5050106@chip-web.com> Date: Fri, 31 Jul 2009 18:55:34 -0700 From: Ludwig Pummer User-Agent: Thunderbird 2.0.0.22 (Windows/20090605) MIME-Version: 1.0 To: Simun Mikecin References: <4A712290.9030308@chip-web.com> <46899.11156.qm@web37301.mail.mud.yahoo.com> <4A714B03.6050704@chip-web.com> In-Reply-To: <4A714B03.6050704@chip-web.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: ZFS raidz1 pool unavailable from losing 1 device X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Aug 2009 03:18:30 -0000 Ludwig Pummer wrote: > Simun Mikecin wrote: >> Ludwin Pummer wrote: >> >> >>> My system is 7.2-STABLE Jul 27, amd64, 4GB memory, just upgraded >>> from 6.4-STABLE from last year. I just set up a ZFS raidz volume to >>> replace a graid5 volume I had been using. I had it successfully set >>> up using partitions across 4 disks, ad{6,8,10,12}s1e. Then I wanted >>> to expand the raidz volume by merging the space from the adjacent >>> disk partition. 
I thought I could just fail out the partition device
>>> in ZFS, edit the bsdlabel, and re-add the larger partition, ZFS
>>> would resilver, repeat until done. That's when I found out that ZFS
>>> doesn't let you fail out a device in a raidz volume. No big deal, I
>>> thought, I'll just go to single user mode and mess with the
>>> partition when ZFS isn't looking. When it comes back up it should
>>> notice that one of the devices is gone, I can do a 'zfs replace' and
>>> continue my plan.
>>>
>>> Well, after rebooting to single user mode, combining partitions
>>> ad12s1d and ad12s1e (removed the d partition), "zfs volinit", then
>>> "zpool status" just hung (Ctrl-C didn't kill it, so I rebooted). I
>>> thought this was a bit odd so I thought perhaps ZFS is confused by
>>> the ZFS metadata left on ad12s1e, so I blanked it out with "dd".
>>> That didn't help. I changed the name of the partition to ad12s1d
>>> thinking perhaps that would help. After that, "zfs volinit; zfs
>>> mount -a; zpool status" showed my raidz pool UNAVAIL with the
>>> message "insufficient replicas", ad{6,8,10}s1e ONLINE, and ad12s1e
>>> UNAVAIL "cannot open", and a more detailed message pointing me to
>>> http://www.sun.com/msg/ZFS-8000-3C. I tried doing a "zpool replace
>>> storage ad12s1e ad12s1d" but it refused, saying my zpool ("storage")
>>> was unavailable. Ditto for pretty much every zpool command I tried.
>>> "zpool clear" gave me a "permission denied" error.
>>>
>>
>> Was your pool imported while you were repartitioning in single user
>> mode?
>>
> Yes, I guess you could say it was. ZFS wasn't loaded while I was doing
> the repartitioning, though.
>
> --Ludwig
>

Well, I figured out my problem. I didn't actually have a raidz1 volume. I missed the magic word "raidz" when I performed the "zpool create" so I created a JBOD.
Removing one disk legitmately destroyed my zpool :( --Ludwig From owner-freebsd-fs@FreeBSD.ORG Sat Aug 1 08:39:47 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 222871065675; Sat, 1 Aug 2009 08:39:47 +0000 (UTC) (envelope-from davidn04@gmail.com) Received: from mail-qy0-f191.google.com (mail-qy0-f191.google.com [209.85.221.191]) by mx1.freebsd.org (Postfix) with ESMTP id BE6CA8FC17; Sat, 1 Aug 2009 08:39:46 +0000 (UTC) (envelope-from davidn04@gmail.com) Received: by qyk29 with SMTP id 29so3344338qyk.3 for ; Sat, 01 Aug 2009 01:39:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type:content-transfer-encoding; bh=Os9Y3dWtQuy3/fpqnvID6JExqU/NvApiQ7nCLEYRybY=; b=czgQC+CZMnkWEizSUCcuHFHej/mnS0oUYdhVLWmm7z/JAUZPw+1j7Luug3EFAOSMN3 2pnzFCF4KoaX5bH2yJEWPcutC5v7zyyaDeGZDMoZZpUSr6RCUHsQOVHKhuS8CjAbpswS y0M2YGM/5FXP3QujjOHyD5uJinZVz5SIYP8Sw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type :content-transfer-encoding; b=O8tTO5oq5/bdcgtsTe0MXZl1a7+EiD3bTqyfZoG0IPU2jm0mMoAdj9tGx/+e0Rj8rs 9lSxepLMvAOGF2b8mScMPLfspzREYXL17DFwyrErTM2zIyMKR1orwIfnb+TZozAW6Wm/ cwpmt8T1kJiJWlLaUeEVEe+JxBgOMqHJbhRCY= MIME-Version: 1.0 Received: by 10.229.110.2 with SMTP id l2mr625284qcp.27.1249114634816; Sat, 01 Aug 2009 01:17:14 -0700 (PDT) Date: Sat, 1 Aug 2009 18:17:14 +1000 Message-ID: <4d7dd86f0908010117o77757798p6585148ab829e088@mail.gmail.com> From: David N To: freebsd-fs@freebsd.org, aoyama@peach.ne.jp, freebsd-ports@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Subject: iSTGT error messages X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Aug 2009 08:39:47 -0000

Jul 31 01:40:30 netserv1 istgt[13674]: Login from iqn.example.net (10.1.20.15) on iqn.example.net:mail2disk1 LU1 (10.1.10.1:3260,1), ISID=23d010000, TSIH=40, CID=0, HeaderDigest=off, DataDigest=off
Jul 31 03:10:23 netserv1 istgt[13674]: istgt_iscsi.c:3338:istgt_iscsi_op_nopout: ***ERROR*** StatSN(460107/460117) error
Jul 31 03:10:23 netserv1 istgt[13674]: istgt_iscsi.c:3762:istgt_iscsi_execute: ***ERROR*** iscsi_op_nopout() failed
Jul 31 03:10:23 netserv1 istgt[13674]: istgt_iscsi.c:4088:worker: ***ERROR*** iscsi_execute() failed

iSTGT istgt-20090428
FreeBSD 7.2-R
iSCSI 10GB disk on FreeBSD
Open-iscsi 2.0.865-1ubuntu3.3 client

Does anyone have any idea what the errors mean? There are a lot of repeated messages in the log file. It will connect, then the error will occur and it'll reconnect and so forth.

Regards
David N

From owner-freebsd-fs@FreeBSD.ORG Sat Aug 1 09:11:39 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DE17D106564A for ; Sat, 1 Aug 2009 09:11:39 +0000 (UTC) (envelope-from marius@nuenneri.ch) Received: from mail-fx0-f210.google.com (mail-fx0-f210.google.com [209.85.220.210]) by mx1.freebsd.org (Postfix) with ESMTP id 7940F8FC0A for ; Sat, 1 Aug 2009 09:11:39 +0000 (UTC) (envelope-from marius@nuenneri.ch) Received: by fxm6 with SMTP id 6so516616fxm.43 for ; Sat, 01 Aug 2009 02:11:38 -0700 (PDT) MIME-Version: 1.0 Received: by 10.102.218.6 with SMTP id q6mr1330717mug.93.1249116616740; Sat, 01 Aug 2009 01:50:16 -0700 (PDT) In-Reply-To: <4A73A096.5050106@chip-web.com> References: <4A712290.9030308@chip-web.com> <46899.11156.qm@web37301.mail.mud.yahoo.com> <4A714B03.6050704@chip-web.com> <4A73A096.5050106@chip-web.com> Date: Sat, 1 Aug 2009 10:50:16 +0200 Message-ID: From: =?ISO-8859-1?Q?Marius_N=FCnnerich?= To: Ludwig Pummer Content-Type:
text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: ZFS raidz1 pool unavailable from losing 1 device X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Aug 2009 09:11:40 -0000 On Sat, Aug 1, 2009 at 03:55, Ludwig Pummer wrote: > Ludwig Pummer wrote: >> >> Simun Mikecin wrote: >>> >>> Ludwin Pummer wrote: >>> >>> >>>> >>>> My system is 7.2-STABLE Jul 27, amd64, 4GB memory, just upgraded from >>>> 6.4-STABLE from last year. I just set up a ZFS raidz volume to replace a >>>> graid5 volume I had been using. I had it successfully set up using >>>> partitions across 4 disks, ad{6,8,10,12}s1e. Then I wanted to expand the >>>> raidz volume by merging the space from the adjacent disk partition. I >>>> thought I could just fail out the partition device in ZFS, edit the >>>> bsdlabel, and re-add the larger partition, ZFS would resilver, repeat until >>>> done. That's when I found out that ZFS doesn't let you fail out a device in >>>> a raidz volume. No big deal, I thought, I'll just go to single user mode and >>>> mess with the partition when ZFS isn't looking. When it comes back up it >>>> should notice that one of the device is gone, I can do a 'zfs replace' and >>>> continue my plan. >>>> >>>> Well, after rebooting to single user mode, combining partitions ad12s1d >>>> and ad12s1e (removed the d partiton), "zfs volinit", then "zpool status" >>>> just hung (Ctrl-C didn't kill it, so I rebooted). I thought this was a bit >>>> odd so I thought perhaps ZFS is confused by the ZFS metadata left on >>>> ad12s1e, so I blanked it out with "dd". That didn't help. I changed the name >>>> of the partition to ad12s1d thinking perhaps that would help. 
After that,
>>>> "zfs volinit; zfs mount -a; zpool status" showed my raidz pool UNAVAIL with
>>>> the message "insufficient replicas", ad{6,8,10}s1e ONLINE, and ad12s1e
>>>> UNAVAIL "cannot open", and a more detailed message pointing me to
>>>> http://www.sun.com/msg/ZFS-8000-3C. I tried doing a "zpool replace storage
>>>> ad12s1e ad12s1d" but it refused, saying my zpool ("storage") was
>>>> unavailable. Ditto for pretty much every zpool command I tried. "zpool
>>>> clear" gave me a "permission denied" error.
>>>>
>>>
>>> Was your pool imported while you were repartitioning in single user mode?
>>>
>>
>> Yes, I guess you could say it was. ZFS wasn't loaded while I was doing the
>> repartitioning, though.
>>
>> --Ludwig
>>
>
> Well, I figured out my problem. I didn't actually have a raidz1 volume. I
> missed the magic word "raidz" when I performed the "zpool create" so I
> created a JBOD. Removing one disk legitimately destroyed my zpool :(
>
> --Ludwig

That's bad. But it doesn't explain why the disk names changed. I guess
there is a race in tasting either the original ad* providers or the
one-sector-smaller label/foo providers. May I suggest that you, or other
people reading this, try to use gpt labels in the future, as they are
definitely there only _after_ gpt has tasted. Sadly they are only available in
8-current right now.
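
The raidz-versus-stripe mix-up above comes down to a single keyword on the "zpool create" command line. The following is only a minimal dry-run sketch of that difference, not the commands that were actually run: the pool name "storage" and the ad{6,8,10,12}s1e partitions are taken from this thread, while the GPT label names disk0 to disk3 and the run/create_pool helpers are hypothetical. Nothing is executed unless DRY_RUN=0 is set explicitly.

```shell
#!/bin/sh
# Dry-run sketch: prints zpool commands instead of running them.
# "storage" and ad{6,8,10,12}s1e come from the thread; the gpt/diskN
# label names and the helper functions are assumptions for illustration.

run() {
    # Always show the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

create_pool() {
    # $1 is the layout, the remaining arguments are the member devices.
    layout=$1
    shift
    case "$layout" in
        # No vdev keyword: the devices form a plain stripe, no redundancy.
        stripe) run zpool create storage "$@" ;;
        # With the "raidz" keyword: one single-parity raidz vdev.
        raidz)  run zpool create storage raidz "$@" ;;
        *)      echo "unknown layout: $layout" >&2; return 1 ;;
    esac
}

# Printed side by side for comparison; for real use, call only one.
create_pool stripe ad6s1e ad8s1e ad10s1e ad12s1e
create_pool raidz ad6s1e ad8s1e ad10s1e ad12s1e

# Same raidz layout, but addressed through (hypothetical) GPT labels.
create_pool raidz gpt/disk0 gpt/disk1 gpt/disk2 gpt/disk3

# Check the layout before trusting the pool with data.
run zpool status storage
```

On a real raidz pool, "zpool status storage" lists the member partitions under a raidz1 entry; in a plain stripe they appear directly under the pool name, which is a quick way to catch the missing keyword before any disk is pulled.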
From owner-freebsd-fs@FreeBSD.ORG Sat Aug 1 10:57:38 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 96EA1106564A; Sat, 1 Aug 2009 10:57:38 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 058B18FC18; Sat, 1 Aug 2009 10:57:37 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:46264 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1MXCHN-0007FP-3e; Sat, 01 Aug 2009 12:57:35 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id D96ACEE65E; Sat, 1 Aug 2009 12:57:31 +0200 (CEST) Message-Id: <4B49A2A0-2437-48A4-9047-80267BD4148F@exscape.org> From: Thomas Backman To: freebsd-fs@freebsd.org, FreeBSD current Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Sat, 1 Aug 2009 12:57:29 +0200 X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MXCHN-0007FP-3e. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MXCHN-0007FP-3e 0305009ebe3e710f7c1f36d3eddd9cd9 Cc: Subject: Samba + ZFS panic w/ DEBUG_VFS_LOCKS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Aug 2009 10:57:39 -0000 I just installed samba (ports/net/samba3) on my test machine to see if some simple media streaming from ZFS would work. It did not; smbd didn't even start before it panicked... 
At "Starting smbd" I got the following panic:
(Note: I haven't tried without DEBUG_VFS_LOCKS yet. I do suppose that it's not supposed to panic even with rigorous debugging enabled, though!)

Unread portion of the kernel message buffer:
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
vfs_badlock() at vfs_badlock+0x95
assert_vop_elocked() at assert_vop_elocked+0x64
VOP_PUTPAGES_APV() at VOP_PUTPAGES_APV+0x5b
vnode_pager_putpages() at vnode_pager_putpages+0xa9
vm_pageout_flush() at vm_pageout_flush+0xd1
vm_object_page_collect_flush() at vm_object_page_collect_flush+0x2f0
vm_object_page_clean() at vm_object_page_clean+0x143
fsync() at fsync+0x121
syscall() at syscall+0x28f
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (95, FreeBSD ELF64, fsync), rip = 0x801064dac, rsp = 0x7fffffffe5d8, rbp = 0x801336480 ---
VOP_PUTPAGES: 0xffffff0007649588 is not exclusive locked but should be
KDB: enter: lock violation

0xffffff0007649588: tag zfs, type VREG
    usecount 2, writecount 1, refcount 3 mountedhere 0
    flags (VI_OBJDIRTY)
    v_object 0xffffff000ee6c000 ref 1 pages 2
    lock type zfs: SHARED (count 1)
panic: from debugger
cpuid = 0
KDB: stack backtrace:
Uptime: 17h10m52s
Physical memory: 2034 MB
Dumping 1723 MB:
...
at /usr/src/sys/amd64/amd64/trap.c:613
#9 0xffffffff8057eda7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#10 0xffffffff8036c8ad in kdb_enter (why=0xffffffff80613fd5 "vfslock", msg=0xa
) at cpufunc.h:63
#11 0xffffffff803cb3a4 in assert_vop_elocked (vp=0xffffff0007649588, str=0xffffffff80642728 "VOP_PUTPAGES") at /usr/src/sys/kern/vfs_subr.c:3722
#12 0xffffffff805c80eb in VOP_PUTPAGES_APV (vop=0xffffffff807a07c0, a=0xffffff803eb72730) at vnode_if.c:2664
#13 0xffffffff80572cd9 in vnode_pager_putpages (object=0xffffff000ee6c000, m=0xffffff803eb72830, count=8192, sync=12, rtvals=0xffffff803eb727a0) at vnode_if.h:1169
#14 0xffffffff8056d601 in vm_pageout_flush (mc=0xffffff803eb72830, count=2, flags=12) at vm_pager.h:148
#15 0xffffffff80568e30 in vm_object_page_collect_flush (object=0xffffff000ee6c000, p=Variable "p" is not available.
) at /usr/src/sys/vm/vm_object.c:1032
#16 0xffffffff80569023 in vm_object_page_clean (object=0xffffff000ee6c000, start=0, end=Variable "end" is not available.
) at /usr/src/sys/vm/vm_object.c:844
#17 0xffffffff803d3bd1 in fsync (td=0xffffff0027f45000, uap=Variable "uap" is not available.
) at /usr/src/sys/kern/vfs_syscalls.c:3519
#18 0xffffffff80598e7f in syscall (frame=0xffffff803eb72c80) at /usr/src/sys/amd64/amd64/trap.c:984
#19 0xffffffff8057f081 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:373
#20 0x0000000801064dac in ?? ()
Previous frame inner to this frame (corrupt stack?)

(kgdb) fr 11
#11 0xffffffff803cb3a4 in assert_vop_elocked (vp=0xffffff0007649588, str=0xffffffff80642728 "VOP_PUTPAGES") at /usr/src/sys/kern/vfs_subr.c:3722
3722 vfs_badlock("is not exclusive locked but should be", str, vp);
(kgdb) p *vp
$1 = {v_type = VREG, v_tag = 0xffffffff80b59327 "zfs", v_op = 0xffffffff80b5dee0, v_data = 0xffffff00052cb758,
  v_mount = 0xffffff00018392f0, v_nmntvnodes = {tqe_next = 0x0, tqe_prev = 0xffffff006895b028}, v_un = {vu_mount = 0x0, vu_socket = 0x0,
  vu_cdev = 0x0, vu_fifoinfo = 0x0, vu_yield = 0}, v_hashlist = {le_next = 0x0, le_prev = 0x0}, v_hash = 0, v_cache_src = {
  lh_first = 0x0}, v_cache_dst = {tqh_first = 0x0, tqh_last = 0xffffff00076495e8}, v_cache_dd = 0x0, v_cstart = 0, v_lasta = 0,
  v_lastw = 0, v_clen = 0, v_lock = {lock_object = {lo_name = 0xffffffff80b59327 "zfs", lo_flags = 91947008, lo_data = 0,
  lo_witness = 0x0}, lk_lock = 17, lk_timo = 51, lk_pri = 80}, v_interlock = {lock_object = {
  lo_name = 0xffffffff80614670 "vnode interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4},
  v_vnlock = 0xffffff0007649620, v_holdcnt = 3, v_usecount = 2, v_iflag = 1024, v_vflag = 0, v_writecount = 1, v_freelist = {
  tqe_next = 0x0, tqe_prev = 0x0}, v_bufobj = {bo_mtx = {lock_object = {lo_name = 0xffffffff80614680 "bufobj interlock",
  lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, bo_clean = {bv_hd = {tqh_first = 0x0,
  tqh_last = 0xffffff00076496c0}, bv_root = 0x0, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xffffff00076496e0},
  bv_root = 0x0, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_ops = 0xffffffff8079d620, bo_bsize = 131072,
  bo_object = 0xffffff000ee6c000, bo_synclist = {le_next = 0x0, le_prev = 0x0}, bo_private = 0xffffff0007649588,
  __bo_vnode = 0xffffff0007649588}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0xffffff000d402600}

(kgdb) fr 17
#17 0xffffffff803d3bd1 in fsync (td=0xffffff0027f45000, uap=Variable "uap" is not available.
) at /usr/src/sys/kern/vfs_syscalls.c:3519
3519 vn_finished_write(mp);
(kgdb) p *mp
$2 = {mnt_mtx = {lock_object = {lo_name = 0xffffffff80613905 "struct mount mtx", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0},
  mtx_lock = 4}, mnt_gen = 1, mnt_list = {tqe_next = 0xffffff0001bf75e0, tqe_prev = 0xffffff0001839608}, mnt_op = 0xffffffff80b5de40,
  mnt_vfc = 0xffffffff80b5dde0, mnt_vnodecovered = 0xffffff0001ae6000, mnt_syncer = 0xffffff0001be2760, mnt_ref = 14897, mnt_nvnodelist = {
  tqh_first = 0xffffff0001be2b10, tqh_last = 0xffffff00076495b0}, mnt_nvnodelistsize = 7449, mnt_writeopcount = 1,
  mnt_kern_flag = 1610612864, mnt_flag = 268439552, mnt_xflag = 0, mnt_noasync = 0, mnt_opt = 0xffffff00017f1830, mnt_optnew = 0x0,
  mnt_maxsymlinklen = 0, mnt_stat = {f_version = 537068824, f_type = 4, f_flags = 268439552, f_bsize = 131072, f_iosize = 131072,
  f_blocks = 485196, f_bfree = 475793, f_bavail = 475793, f_files = 529171, f_ffree = 475793, f_syncwrites = 0, f_asyncwrites = 0,
  f_syncreads = 0, f_asyncreads = 0, f_spare = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, f_namemax = 255, f_owner = 0, f_fsid = {val = {591198578,
  -1876274428}}, f_charspare = '\0' , f_fstypename = "zfs", '\0' ,
  f_mntfromname = "tank/usr", '\0' , f_mntonname = "/usr", '\0' }, mnt_cred = 0xffffff0001be0e00,
  mnt_data = 0xffffff0001a89000, mnt_time = 0, mnt_iosize_max = 65536, mnt_export = 0x0, mnt_label = 0x0, mnt_hashseed = 2610436692,
  mnt_lockref = 0, mnt_secondary_writes = 0, mnt_secondary_accwrites = 0, mnt_susp_owner = 0x0, mnt_gjprovider = 0x0, mnt_explock = {
  lock_object = {lo_name = 0xffffffff80613916 "explock", lo_flags = 91422720, lo_data = 0, lo_witness = 0x0}, lk_lock = 1, lk_timo = 0,
  lk_pri = 80}}

# uname -a
FreeBSD chaos.exscape.org 8.0-BETA2 FreeBSD 8.0-BETA2 #7 r195910M: Thu Jul 30 19:03:33 CEST 2009 root@chaos.exscape.org:/usr/obj/usr/src/sys/DTRACE amd64

As I said, DEBUG_VFS_LOCKS is enabled.
Should I disable DEBUG_VFS_LOCKS and consider this "normal" (if it doesn't still panic, that is), or is this a real issue?
(Note that while *mp points to /usr, FWIW, /usr is not shared by samba, nor is any FS below it. Also note that my debugging skills are at an early stage... so the info provided may be useless.)

Regards,
Thomas

From owner-freebsd-fs@FreeBSD.ORG Sat Aug 1 14:53:10 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 96513106566C; Sat, 1 Aug 2009 14:53:10 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (skuns.zoral.com.ua [91.193.166.194]) by mx1.freebsd.org (Postfix) with ESMTP id EA3428FC15; Sat, 1 Aug 2009 14:53:09 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id n71Er2bd004050 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 1 Aug 2009 17:53:02 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3) with ESMTP id n71Er1xR002982; Sat, 1 Aug 2009 17:53:01 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3/Submit) id n71Er1qF002981; Sat, 1 Aug 2009 17:53:01 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 1 Aug 2009 17:53:01 +0300 From: Kostik Belousov To: Thomas Backman Message-ID: <20090801145301.GE1884@deviant.kiev.zoral.com.ua> References: <4B49A2A0-2437-48A4-9047-80267BD4148F@exscape.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="JoaBr9Q1T6GV3lOg" Content-Disposition:
inline In-Reply-To: <4B49A2A0-2437-48A4-9047-80267BD4148F@exscape.org> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: freebsd-fs@freebsd.org, FreeBSD current Subject: Re: Samba + ZFS panic w/ DEBUG_VFS_LOCKS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Aug 2009 14:53:11 -0000

--JoaBr9Q1T6GV3lOg Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Sat, Aug 01, 2009 at 12:57:29PM +0200, Thomas Backman wrote:
> I just installed samba (ports/net/samba3) on my test machine to see if
> some simple media streaming from ZFS would work. It did not; smbd
> didn't even start before it panicked... At "Starting smbd" I got the
> following panic:
>
> (Note: I haven't tried without DEBUG_VFS_LOCKS yet. I do suppose that
> it's not supposed to panic even with rigorous debugging enabled,
> though!)
> [...]
> As I said, DEBUG_VFS_LOCKS is enabled.
> Should I disable DEBUG_VFS_LOCKS and consider this "normal" (if it
> doesn't still panic, that is), or is this a real issue?
> (Note that while *mp points to /usr, FWIW, /usr is not shared by
> samba, nor is any FS below it. Also note that my debugging skills are
> at an early stage... so the info provided may be useless.)

It does not matter whether the ZFS file system is accessed by samba. The panic
happens when fsync(2) is called on a vnode whose vm object is marked dirty
and DEBUG_VFS_LOCKS is configured.

The workaround is to disable DEBUG_VFS_LOCKS.
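
The trigger described above (fsync(2) on a vnode whose vm object is dirty) can be sketched from userland. This is only a hypothetical illustration, not a confirmed reproduction: it assumes the pages were dirtied by writing through a shared mmap(2) mapping, which the report does not state, and the helper name `dirty_and_fsync` and the file path passed to it are made up here. On a kernel without the lock assertion it should simply return 0.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): dirty a file's pages through
 * a shared mapping, then fsync(2) the descriptor.
 * Returns 0 on success, -1 on error.
 */
static int
dirty_and_fsync(const char *path)
{
	const size_t len = 8192;	/* arbitrary size, assumed for the sketch */
	int fd, rv = -1;
	char *p;

	assert(path != NULL);
	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd == -1)
		return (-1);
	if (ftruncate(fd, (off_t)len) == -1)
		goto out;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto out;
	memset(p, 0xa5, len);	/* stores through the mapping dirty its pages */
	/*
	 * Flush while the pages are still dirty. In the backtrace from the
	 * report, this is the fsync() -> vm_object_page_clean() ->
	 * VOP_PUTPAGES path.
	 */
	rv = fsync(fd);
	(void)munmap(p, len);
out:
	(void)close(fd);
	(void)unlink(path);	/* remove the scratch file again */
	return (rv);
}
```

Calling `dirty_and_fsync()` with a scratch path on the file system under test would exercise that path; whether it trips the assertion depends on the kernel options and the file system's locking, which this sketch does not check.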
Since vnode_pager_generic_putpages seems to work with a shared vnode lock,
provided that VOP_WRITE works right with a shared lock, change
sys/kern/vnode_if.src, line 475 from
%% putpages vp E E E
to
%% putpages vp L L L

--JoaBr9Q1T6GV3lOg Content-Type: application/pgp-signature Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (FreeBSD)

iEYEARECAAYFAkp0Vs0ACgkQC3+MBN1Mb4iXWQCfSvJ5g6W/YqD+gFMdnJXQ+IX9
/SQAnj+H+cNU8V5EQcOit2OpemYVGycP
=j2WY
-----END PGP SIGNATURE-----

--JoaBr9Q1T6GV3lOg--

From owner-freebsd-fs@FreeBSD.ORG Sat Aug 1 17:41:10 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C85AC106564A for ; Sat, 1 Aug 2009 17:41:10 +0000 (UTC) (envelope-from kientzle@freebsd.org) Received: from kientzle.com (kientzle.com [66.166.149.50]) by mx1.freebsd.org (Postfix) with ESMTP id 54C828FC18 for ; Sat, 1 Aug 2009 17:41:10 +0000 (UTC) (envelope-from kientzle@freebsd.org) Received: (from root@localhost) by kientzle.com (8.14.3/8.14.3) id n71HC7ir061638; Sat, 1 Aug 2009 10:12:07 -0700 (PDT) (envelope-from kientzle@freebsd.org) Received: from dark.x.kientzle.com (fw2.kientzle.com [10.123.1.2]) by kientzle.com with SMTP id vbwmjdtehhcjv7yi7zxb5zd3pi; Sat, 01 Aug 2009 10:12:07 -0700 (PDT) (envelope-from kientzle@freebsd.org) Message-ID: <4A747766.10901@freebsd.org> Date: Sat, 01 Aug 2009 10:12:06 -0700 From: Tim Kientzle User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.8.1.21) Gecko/20090601 SeaMonkey/1.1.16 MIME-Version: 1.0 To: "James R.
Van Artsdalen" References: <20090727072503.GA52309@jpru.ffm.jpru.de> <20090729084723.GD1586@garage.freebsd.pl> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org> <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org> <4A718E03.6030909@freebsd.org> <71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org> <4A719CA4.4060400@freebsd.org> <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org> <4A71AD29.10705@freebsd.org> <7544AED1-1216-4A24-B287-F54117641F76@exscape.org> <4A71B2DA.9060902@freebsd.org> <4A72D946.4090401@jrv.org> In-Reply-To: <4A72D946.4090401@jrv.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, FreeBSD current , Pawel Jakub Dawidek , Thomas Backman , Andriy Gapon Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Aug 2009 17:41:11 -0000

James R. Van Artsdalen wrote:
> Andriy Gapon wrote:
>>
>> One comment on the patch - I personally don't like bit-wise xor in a logical
>> expression. But if otherwise the expression would be huge and ugly, then OK.
>
> If you're going to code an XOR, use an XOR.

Or != which produces the same result for logical values and
is sometimes easier to understand.

Tim
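
The point about `!=` can be shown with a small C sketch. The function names below are hypothetical and are not taken from the patch under discussion; the sketch only assumes that "logical values" means operands already normalised to 0 or 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Bitwise XOR: a correct logical XOR only when both operands are 0 or 1. */
static int
xor_bitwise(int a, int b)
{
	return (a ^ b);
}

/* Logical XOR: normalise each operand to 0/1 with !!, then compare. */
static bool
xor_logical(int a, int b)
{
	return (!!a != !!b);
}

/* Exhaustive check that the two forms agree on logical (0/1) values. */
static void
check_agree_on_logical_values(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			assert((xor_bitwise(a, b) != 0) == xor_logical(a, b));
}
```

For operands that are not already 0 or 1 the two forms differ: `xor_bitwise(2, 1)` is 3, which is true in a condition, while `xor_logical(2, 1)` is false because both operands are nonzero. That is the case where `!=` on normalised values is arguably the easier form to read.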