From owner-freebsd-fs@FreeBSD.ORG  Thu Aug 22 10:13:50 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTP id DE8FB27D;
 Thu, 22 Aug 2013 10:13:50 +0000 (UTC)
 (envelope-from lists@yamagi.org)
Received: from mail.yamagi.org (mail.yamagi.org [IPv6:2a01:4f8:121:2102:1::7])
 by mx1.freebsd.org (Postfix) with ESMTP id 5EA6E2F79;
 Thu, 22 Aug 2013 10:13:50 +0000 (UTC)
Received: from lennart.pwag-local.de (unknown [212.48.125.109])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mail.yamagi.org (Postfix) with ESMTPSA id 154E61666312;
 Thu, 22 Aug 2013 12:13:47 +0200 (CEST)
Date: Thu, 22 Aug 2013 12:13:41 +0200
From: Yamagi Burmeister <lists@yamagi.org>
To: freebsd-fs@freebsd.org
Subject: Re: 9.2-RC1: LORs / Deadlock with SU+J on HAST in "memsync" mode
Message-Id: <20130822121341.0f27cb5e372d12bab8725654@yamagi.org>
In-Reply-To: <20130819115101.ae9c0cf788f881dc4de464c5@yamagi.org>
References: <20130819115101.ae9c0cf788f881dc4de464c5@yamagi.org>
X-Mailer: Sylpheed 3.3.0 (GTK+ 2.24.19; amd64-portbld-freebsd9.2)
Mime-Version: 1.0
Content-Type: multipart/signed; protocol="application/pgp-signature";
 micalg="PGP-SHA1";
 boundary="Signature=_Thu__22_Aug_2013_12_13_41_+0200_1edTGGGYQbPa+y4n"
Cc: trociny@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 22 Aug 2013 10:13:50 -0000

--Signature=_Thu__22_Aug_2013_12_13_41_+0200_1edTGGGYQbPa+y4n
Content-Type: text/plain; charset=US-ASCII
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hello Mikolaj and Pawel,
I've been told that the issue below is most likely a problem with HAST.
Since it's your pet, you're my next victims. :)

A short summary:
After upgrading some systems to FreeBSD 9.2-RC1/RC2 and switching
HAST to the new "memsync" mode, I've seen processes getting stuck when
accessing files on UFS filesystems with SU+J enabled. Testing showed
that this only seems to happen (I couldn't reproduce it in other
combinations, but I'm not quite sure that's really the case) when HAST
is running in "memsync" mode and the UFS filesystem on HAST has SU+J
enabled. It can be reproduced easily with the instructions below.
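
A minimal sketch of steps 2 to 4 from the instructions quoted below,
assuming a HAST resource that is already configured in "memsync" mode
with the secondary connected. The names DEVICE, MNT and DRY_RUN and the
helper functions are made up for illustration and are not part of HAST;
by default the script only prints what it would do:

```shell
#!/bin/sh
# Sketch only: DEVICE, MNT and DRY_RUN are hypothetical knobs.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DEVICE="${DEVICE:-hast0}"
MNT="${MNT:-/mnt}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Steps 2 and 3: create a UFS filesystem with SU+J on the HAST
# provider and mount it. This destroys any data on the provider.
prepare() {
    run newfs -U -j "/dev/hast/$DEVICE"
    run mount "/dev/hast/$DEVICE" "$MNT"
}

# Step 4: one round of load, copying the ports tree and removing it.
load_once() {
    run cp -R /usr/ports "$MNT/ports"
    run rm -rf "$MNT/ports"
}

# Only act when explicitly asked to with the "go" argument.
if [ "${1:-}" = "go" ]; then
    prepare
    while :; do
        load_once
        # A dry run shows a single round instead of looping forever.
        if [ "$DRY_RUN" = 1 ]; then break; fi
    done
fi
```

Running it as "DRY_RUN=0 DEVICE=<resource> sh script.sh go" on the
primary would loop until interrupted or until processes on the
mountpoint get stuck; without DRY_RUN=0 it just lists the commands.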

Since my first mail I've done another DDB session, but I'm not sure if
it's helpful since I focused on deadlocks. It can be found here:
http://deponie.yamagi.org/freebsd/debug/lor_hast/ddb2.txt

Some help with this issue would be appreciated. At least for me, HAST
is currently unusable when running in "memsync" mode with UFS and SU+J
on it.

As said before, more information can be provided.

Ciao,
Yamagi

On Mon, 19 Aug 2013 11:51:01 +0200
Yamagi Burmeister <lists@yamagi.org> wrote:

> Hello,
> in the last week I've seen several deadlocked processes (stuck in
> state "vofflock" or "wswbuf0") on a UFS2 filesystem with SU+J on HAST
> in "memsync" mode. The deadlocks disappear if either SU+J is
> disabled (only SU is active) or HAST is switched to "fullsync" mode.
>=20
> The system is:
> FreeBSD tvtransfer.local 9.2-RC1 FreeBSD 9.2-RC1 #0 r254355M:
> Fri Aug 16 12:35:30 UTC 2013
> support@tvtransfer.local:/usr/obj/usr/src/sys/GENERIC  amd64
>=20
> I've built a kernel with full debugging support and seen these two
> LORs besides the known false positive between bufwait and dirhash:
>=20
> lock order reversal:
>  1st 0xfffffe0197310d68 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2223
>  2nd 0xffffff87be8be0f8 bufwait (bufwait) @ /usr/src/sys/ufs/ffs/ffs_vnops.c:261
>  3rd 0xfffffe019746f098 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2223
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a/frame 0xffffff88715dcd50
> kdb_backtrace() at kdb_backtrace+0x37/frame 0xffffff88715dce10
> _witness_debugger() at _witness_debugger+0x2c/frame 0xffffff88715dce30
> witness_checkorder() at witness_checkorder+0x875/frame 0xffffff88715dcef0
> __lockmgr_args() at __lockmgr_args+0x1141/frame 0xffffff88715dcfd0
> ffs_lock() at ffs_lock+0x9c/frame 0xffffff88715dd020
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0xe3/frame 0xffffff88715dd050
> _vn_lock() at _vn_lock+0x55/frame 0xffffff88715dd0b0
> vget() at vget+0x7b/frame 0xffffff88715dd100
> vfs_hash_get() at vfs_hash_get+0xd5/frame 0xffffff88715dd150
> ffs_vgetf() at ffs_vgetf+0x48/frame 0xffffff88715dd1d0
> softdep_sync_buf() at softdep_sync_buf+0x547/frame 0xffffff88715dd2a0
> ffs_syncvnode() at ffs_syncvnode+0x2c1/frame 0xffffff88715dd310
> ffs_truncate() at ffs_truncate+0x10bb/frame 0xffffff88715dd560
> ufs_direnter() at ufs_direnter+0x550/frame 0xffffff88715dd630
> ufs_makeinode() at ufs_makeinode+0x355/frame 0xffffff88715dd7f0
> VOP_CREATE_APV() at VOP_CREATE_APV+0x102/frame 0xffffff88715dd820
> vn_open_cred() at vn_open_cred+0x45e/frame 0xffffff88715dd990
> kern_openat() at kern_openat+0x20a/frame 0xffffff88715ddb10
> amd64_syscall() at amd64_syscall+0x2f9/frame 0xffffff88715ddc30
> Xfast_syscall() at Xfast_syscall+0xf7/frame 0xffffff88715ddc30
> --- syscall (5, FreeBSD ELF64, sys_open), rip =3D 0x800b38d3c, rsp =3D
> 0x7fffffffd128, rbp =3D 0x7fffffffda40 ---=20
>=20
> lock order reversal:
>  1st 0xfffffe0035dfbf30 so_snd_sx (so_snd_sx) @ /usr/src/sys/kern/uipc_sockbuf.c:145
>  2nd 0xfffffe02ead2b328 ufs (ufs) @ /usr/src/sys/kern/uipc_syscalls.c:2062
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a/frame 0xffffff8871515530
> kdb_backtrace() at kdb_backtrace+0x37/frame 0xffffff88715155f0
> _witness_debugger() at _witness_debugger+0x2c/frame 0xffffff8871515610
> witness_checkorder() at witness_checkorder+0x875/frame 0xffffff88715156d0
> __lockmgr_args() at __lockmgr_args+0x1446/frame 0xffffff88715157b0
> ffs_lock() at ffs_lock+0x9c/frame 0xffffff8871515800
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0xe3/frame 0xffffff8871515830
> _vn_lock() at _vn_lock+0x55/frame 0xffffff8871515890
> kern_sendfile() at kern_sendfile+0x8e7/frame 0xffffff8871515ab0
> do_sendfile() at do_sendfile+0xdc/frame 0xffffff8871515b10
> amd64_syscall() at amd64_syscall+0x2f9/frame 0xffffff8871515c30
> Xfast_syscall() at Xfast_syscall+0xf7/frame 0xffffff8871515c30
> --- syscall (393, FreeBSD ELF64, sys_sendfile), rip =3D 0x8022075bc, rsp
> =3D 0x7fffffffb268, rbp =3D 0x7fffffffd570 ---
>=20
> Both LORs can be found in a more readable format here:
> http://deponie.yamagi.org/freebsd/debug/lor_hast/lor.txt
>=20
> The transcript of a DDB session with all the information requested
> in chapter "10.7. Debugging Deadlocks" of the Developers' Handbook
> can be found here:
> http://deponie.yamagi.org/freebsd/debug/lor_hast/ddb.txt
>=20
> Reproducing this problem is easy:
> 1. Create a HAST setup in "memsync" mode with both primary and
>    secondary connected
> 2. "newfs -U -j /dev/hast/$device" on the primary
> 3. "mount /dev/hast/$device /mnt"
> 4. Create some load on /mnt. For example copy /usr/ports in
>    an endless loop. Sooner or later processes accessing /mnt
>    will deadlock. This may be easier to reproduce while HAST
>    is performing the initial sync.
>=20
> More information can be provided if necessary.
>=20
> Thanks,
> Yamagi
>=20
> --=20
> Homepage:  www.yamagi.org
> XMPP:      yamagi@yamagi.org
> GnuPG/GPG: 0xEFBCCBCB


--=20
Homepage:  www.yamagi.org
XMPP:      yamagi@yamagi.org
GnuPG/GPG: 0xEFBCCBCB

--Signature=_Thu__22_Aug_2013_12_13_41_+0200_1edTGGGYQbPa+y4n
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.20 (FreeBSD)

iEYEARECAAYFAlIV5FsACgkQWTjlg++8y8vAsQCePDi5/Xerr+f03wsHtkNzPYfI
kYgAnim9Z/ZEw3zBIz38W9qNBw4epFma
=Iecv
-----END PGP SIGNATURE-----

--Signature=_Thu__22_Aug_2013_12_13_41_+0200_1edTGGGYQbPa+y4n--