Date:      Fri, 7 Jan 2011 21:52:57 +0200
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        freebsd-stable@freebsd.org, Ronald Klop <ronald-freebsd8@klop.yi.org>
Subject:   Re: Hang in VOP_LOCK1_APV on 8-STABLE with NFS.
Message-ID:  <20110107195257.GF12599@deviant.kiev.zoral.com.ua>
In-Reply-To: <1542786719.258389.1294429045433.JavaMail.root@erie.cs.uoguelph.ca>
References:  <op.voxs8lqx8527sy@212-123-145-58.ip.telfort.nl> <1542786719.258389.1294429045433.JavaMail.root@erie.cs.uoguelph.ca>

On Fri, Jan 07, 2011 at 02:37:25PM -0500, Rick Macklem wrote:
> > Hi,
> >
> > OpenOffice hangs on NFS when I try to save a file, or even when I
> > try to open the save dialog.
> >
> > $ 17:25:35 ronald@ronald [~]
> > procstat -kk 85575
> > PID TID COMM TDNAME KSTACK
> > 85575 100322 soffice.bin initial thread mi_switch+0x176
> > sleepq_wait+0x3b __lockmgr_args+0x655 vop_stdlock+0x39
> > VOP_LOCK1_APV+0x46
> > _vn_lock+0x44 vget+0x67 vfs_hash_get+0xeb nfs_nget+0xa8
> > nfs_lookup+0x65e
> > VOP_LOOKUP_APV+0x40 lookup+0x48a namei+0x518 kern_statat_vnhook+0x82
> > kern_statat+0x15 lstat+0x22 syscallenter+0x186 syscall+0x40
> > 85575 100502 soffice.bin - mi_switch+0x176
> > sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _sleep+0x1a0
> > do_cv_wait+0x639 __umtx_op_cv_wait+0x51 syscallenter+0x186
> > syscall+0x40
> > Xfast_syscall+0xe2
> > 85575 100576 soffice.bin - mi_switch+0x176
> > sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _sleep+0x1a0
> > do_cv_wait+0x639 __umtx_op_cv_wait+0x51 syscallenter+0x186
> > syscall+0x40
> > Xfast_syscall+0xe2
> > 85575 100577 soffice.bin - mi_switch+0x176
> > sleepq_catch_signals+0x309 sleepq_wait_sig+0xc _sleep+0x25d
> > kern_accept+0x19c accept+0xfe syscallenter+0x186 syscall+0x40
> > Xfast_syscall+0xe2
> > 85575 100578 soffice.bin - mi_switch+0x176
> > sleepq_catch_signals+0x309 sleepq_wait_sig+0xc _cv_wait_sig+0x10e
> > seltdwait+0xed poll+0x457 syscallenter+0x186 syscall+0x40
> > Xfast_syscall+0xe2
> > 85575 100579 soffice.bin - mi_switch+0x176
> > sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12
> > _cv_timedwait_sig+0x11d seltdwait+0x79 poll+0x457 syscallenter+0x186
> > syscall+0x40 Xfast_syscall+0xe2
> >
> > $ 17:25:35 ronald@ronald [~]
> > uname -a
> > FreeBSD ronald.office.base.nl 8.2-PRERELEASE FreeBSD 8.2-PRERELEASE
> > #6:
> > Mon Dec 27 23:49:30 CET 2010
> > root@ronald.office.base.nl:/usr/obj/usr/src/sys/GENERIC amd64
> >
> I think all the above tells us is that the thread is waiting for
> a vnode lock. The question then becomes "what is holding a lock
> on that vnode and why?".
>
> > It is not possible to exit or kill soffice.bin. I had a slightly
> > different procstat stack before, but that was fixed a couple of days ago.
>
> Yeah, it will be in an uninterruptible sleep when waiting for a vnode lock.
>
> > Any thoughts? Enabling local locks in NFS doesn't fix it.
>
> Here are some things you could try:
> 1 - apply the attached patch. It fixes a known problem w.r.t. the
>     client side of the krpc. Not likely to fix this, but I can hope:-)
1a - Look around for other processes in the uninterruptible sleep state;
quite possibly one of them owns the lock that OpenOffice is waiting
for. Also see
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

Of particular interest are the witness output and the backtraces for
all threads that witness reports as owning vnode locks.
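
As a rough sketch of the first step, assuming procstat prints one thread
per line (as it does before mail wrapping), something like the following
could list every thread that is sleeping inside the lock manager. The
function name vnlock_waiters is made up for illustration, and the third
column is only reliable when the command name contains no spaces.

```shell
#!/bin/sh
# Hypothetical helper: read `procstat -kk` style output on stdin and print
# "PID TID COMM" for each thread whose kernel stack goes through
# __lockmgr_args, i.e. threads blocked like the first soffice.bin thread.
# Assumption: one thread per input line, whitespace-separated columns.
vnlock_waiters() {
    awk '/__lockmgr_args/ { print $1, $2, $3 }'
}

# Possible use on the affected machine (as root), covering all processes:
#   procstat -kk -a | vnlock_waiters
```

This only shows the waiters. The owner of the lock is usually some other
sleeping thread, which is what the witness output helps to pin down.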

> 2 - If #1 doesn't fix the problem:
>     - before making it hang, start capturing packets via:
>     # tcpdump -s 0 -w xxx host server
>     - then make it hang, kill the above and
>     # procstat -ka
>     # ps axHlww
>     and capture the output of both of these. Hopefully these 2 commands
>     will indicate what is holding the vnode lock and maybe, why. The
>     "xxx" file can be looked at in wireshark to see what/if any NFS
>     traffic is happening.
>     If you aren't comfortable looking at the above, you can email them
>     to me and I'll take a stab at them someday.
> 3 - Try the experimental client to see if it behaves differently. The
>     mount command is:
>     # mount -t newnfs -o nfsv3,<the options you already use> server:/path /mntpath
>     (This might identify whether the regular client has an infrequently
>      executed code path that forgets to unlock the vnode, since the
>      experimental client uses a somewhat different RPC layer. The buffer
>      cache handling etc. is almost the same, but the RPC stuff is fairly
>      different.)
>
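
For the "ps axHlww" output from #2, a similar sketch can narrow things
down to threads in uninterruptible wait (STAT starting with "D"). The
function name d_state_threads is hypothetical. The STAT column is looked
up from the header line, on the assumption that no column to its left is
empty or contains spaces.

```shell
#!/bin/sh
# Hypothetical helper: read `ps axHlww` output on stdin, keep the header,
# and print only rows whose STAT field begins with "D".
# Assumption: the first line is the header and contains a "STAT" column.
d_state_threads() {
    awk '
        NR == 1 {
            # Find which field number holds STAT in this ps layout.
            for (i = 1; i <= NF; i++)
                if ($i == "STAT") col = i
            print
            next
        }
        col && $col ~ /^D/ { print }
    '
}

# Possible use:
#   ps axHlww | d_state_threads
```
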
> > The nfs server is an up-to-date Linux Debian 5 with kernel 2.6.26.
> >
> I'm afraid I can't blame Linux (at least not until we have more info;-).
>
> > If more info is needed, I can easily reproduce this.
>
> See above #2.
>
> Good luck with it and let us know how it goes, rick
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"



