Date:      Thu, 24 Jul 2014 19:59:17 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Harald Schmalzbauer <h.schmalzbauer@omnilan.de>
Cc:        FreeBSD Stable <freebsd-stable@freebsd.org>
Subject:   Re: panic/lock on 9.3-RELEASE with nullfs/nfs/zfs combination
Message-ID:  <20140724165917.GT93733@kib.kiev.ua>
In-Reply-To: <53D12973.3010805@omnilan.de>
References:  <53D12973.3010805@omnilan.de>

On Thu, Jul 24, 2014 at 05:42:43PM +0200, Harald Schmalzbauer wrote:
>  Hello,
>
> I'm running 9.3-amd64 with some zfilesystems and a jail.
>
> One zfilesystem is nullfs_mounted into jail.
>
> Now I can export (nfsv4) that nullfs_mounted filesystem and rw-opening a
> file inside the jail from the nullfs_mounted fs works, until a client
> walks into nfs_mounted filesystem (just listing directory contents e.g.).
> So mount shows like this:
>
> tank/my/fs15 mounted on /zfs/netshares/fs15 (zfs, NFS exported, local,
> noatime, noexec, nosuid, nfsv4acls)
> /zfs/netshares/fs15 on /.JAIL/usr/ports (nullfs, local)
>
>
> When I then try to open a file (rw) inside the jail from the
> nullfs_mounted filesystem, 9.3-RELEASE blocks any IO completely on that
> filesystem (local or remote).
> With a debug kernel I get the following panic on the nfs/jail server:
>
> panic: LK_RETRY set with incompatible flags (0x200400) or an error
> occured (11)
> cpuid = 3
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a/frame
> 0xffffff82e54bcc70
> kdb_backtrace() at kdb_backtrace+0x37/frame 0xffffff82e54bcd30
> panic() at panic+0x1cd/frame 0xffffff82e54bce30
> _vn_lock() at _vn_lock+0x67/frame 0xffffff82e54bce90
> zfs_lookup() at zfs_lookup+0x420/frame 0xffffff82e54bcf20
> zfs_freebsd_lookup() at zfs_freebsd_lookup+0xa6/frame 0xffffff82e54bd070
> VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0xd8/frame 0xffffff82e54bd0a0
> vfs_cache_lookup() at vfs_cache_lookup+0xff/frame 0xffffff82e54bd110
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0xd8/frame 0xffffff82e54bd140
> null_lookup() at null_lookup+0x92/frame 0xffffff82e54bd1c0
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0xd8/frame 0xffffff82e54bd1f0
> lookup() at lookup+0x389/frame 0xffffff82e54bd290
> namei() at namei+0x3df/frame 0xffffff82e54bd340
> vn_open_cred() at vn_open_cred+0x1e2/frame 0xffffff82e54bd4b0
> vop_stdvptocnp() at vop_stdvptocnp+0x1af/frame 0xffffff82e54bd7e0
> null_vptocnp() at null_vptocnp+0xf5/frame 0xffffff82e54bd850
> VOP_VPTOCNP_APV() at VOP_VPTOCNP_APV+0xdb/frame 0xffffff82e54bd880
> vn_vptocnp_locked() at vn_vptocnp_locked+0x15b/frame 0xffffff82e54bd910
> vn_fullpath1() at vn_fullpath1+0x100/frame 0xffffff82e54bd970
> kern___getcwd() at kern___getcwd+0xd4/frame 0xffffff82e54bd9d0
> amd64_syscall() at amd64_syscall+0x318/frame 0xffffff82e54bdaf0
> Xfast_syscall() at Xfast_syscall+0xf7/frame 0xffffff82e54bdaf0
> --- syscall (326, FreeBSD ELF64, sys___getcwd), rip = 0x8011a191c, rsp =
> 0x7fffffffe658, rbp = 0x801873400 ---
> KDB: enter: panic
> [ thread pid 1905 tid 100856 ]
> Stopped at kdb_enter+0x3b: movq $0,0x642172(%rip)
>
> Like mentioned, this panic happens only if a nfs(v4) client visits fs15
> (the exported and nullfs_mounted fs) and I try to rw-open any file on
> the nullfs afterwards!!!
>
> How can I provide useful info with KDB? I don't have a dumpdev available
> in that machine???
> http://www.es.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html
> seems not applicable, no /var/crash/?*???
>

The lockmgr flags 0x200400 are LK_SHARED | LK_RETRY, and error 11 == EDEADLK
indicates that curthread already holds the lock in exclusive mode.
I am interested in which line of code took that lock.

Add the DDB, INVARIANTS, WITNESS and DEBUG_VFS_LOCKS options to the kernel
config, reproduce the issue and, once the panic has occurred and you
are at the ddb prompt, issue the command 'show alllocks'.

Also run 'show mount', then 'show mount <addr>', where <addr> is the
address of your nullfs mount point as printed by the first 'show mount'.

I need all console output starting from the panic message.



