From owner-freebsd-stable@FreeBSD.ORG  Wed Jan  2 17:40:54 2013
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: stable@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 5F6322501;
 Wed,  2 Jan 2013 17:40:54 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1])
 by mx1.freebsd.org (Postfix) with ESMTP id B1CF918A2;
 Wed,  2 Jan 2013 17:40:53 +0000 (UTC)
Received: from tom.home (kostik@localhost [127.0.0.1])
 by kib.kiev.ua (8.14.5/8.14.5) with ESMTP id r02HeiBN043937;
 Wed, 2 Jan 2013 19:40:44 +0200 (EET)
 (envelope-from kostikbel@gmail.com)
DKIM-Filter: OpenDKIM Filter v2.7.3 kib.kiev.ua r02HeiBN043937
Received: (from kostik@localhost)
 by tom.home (8.14.5/8.14.5/Submit) id r02Heikc043936;
 Wed, 2 Jan 2013 19:40:44 +0200 (EET)
 (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com
 using -f
Date: Wed, 2 Jan 2013 19:40:44 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Subject: Re: NFS-exported ZFS instability
Message-ID: <20130102174044.GB82219@kib.kiev.ua>
References: <20130102.105304.1817355190360003433.hrs@allbsd.org>
 <1914428061.1617223.1357133079421.JavaMail.root@erie.cs.uoguelph.ca>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="Sz/4MAOlM1c8JZ8/"
Content-Disposition: inline
In-Reply-To: <1914428061.1617223.1357133079421.JavaMail.root@erie.cs.uoguelph.ca>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00,
 DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no
 version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home
Cc: alc <alc@freebsd.org>, stable@freebsd.org
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-stable>,
 <mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
 <mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 02 Jan 2013 17:40:54 -0000


--Sz/4MAOlM1c8JZ8/
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Jan 02, 2013 at 08:24:39AM -0500, Rick Macklem wrote:
> Hiroki Sato wrote:
> > Hello,
> >
> > I have been in a trouble about my NFS server for a long time. The
> > symptom is that it stops working in one or two weeks after a boot. I
> > could not track down the cause yet, but it is reproducible and only
> > occurred under a very high I/O load.
> >
> > It did not panic, just stopped working---while it responded to ping,
> > userland programs seemed not working. I could break it into DDB and
> > get a kernel dump. The following URLs are a log of ps, trace, and
> > etc.:
> >
> > http://people.allbsd.org/~hrs/FreeBSD/pool.log.20130102
> > http://people.allbsd.org/~hrs/FreeBSD/pool.dmesg.20130102
> >
> > Does anyone see how to debug this? I guess this is due to a deadlock
> > somewhere. I have suffered from this problem for almost two years.
> > The above log is from stable/9 as of Dec 19, but this have persisted
> > since 8.X.
> >
> Well, I took a quick glance at the log and there are a lot of processes
> sleeping on "pfault" (in vm_waitpfault() in sys/vm/vm_page.c). I'm no
> vm guy, so I'm not sure when/why that will happen. The comment on the
> function suggests they are waiting for free pages.
>
> Maybe something as simple as running out of swap space or a problem
> talking to the disk(s) that has the swap partition(s) or ???
> (I'm talking through my hat here, because I'm not conversant with
>  the vm side of things.)
>
> I might take a closer look this evening and see if I can spot anything
> in the log, rick
> ps: I hope Alan and Kostik don't mind being added to the cc list.

What I see in the log is a lock cascade rooted in thread 100838, which
owns the system map mutex. I believe this prevents malloc(9) from making
progress in other threads, which e.g. own the ZFS vnode locks. As a
result, the whole system wedged.

Looking back at thread 100838, we can see that it is executing
smp_tlb_shootdown(). It is impossible to tell from the static dump
whether the appearance of smp_tlb_shootdown() in the backtrace is
transient, or whether the thread is spinning there, waiting for other
CPUs to acknowledge the request. But since the system wedged, most
likely smp_tlb_shootdown() is spinning.

Under this hypothesis, the situation most likely arises because some
other core is running with interrupts disabled. However, inspection of
the backtraces of the processes running on all cores does not show any
that could legitimately own a spinlock or otherwise run with interrupts
disabled.

One thing you could try is to enable WITNESS for spinlocks, to try to
catch a leaked spinlock. I very much doubt that this is the case,
though.

Another thing to try is to switch the CPU idle method to something
else. Look at the machdep.idle* sysctls. It could be some CPU erratum
which blocks wakeup by an interrupt under some conditions in C1?
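
A rough sketch of how the idle method could be inspected before
switching it. It assumes the kernel exposes machdep.idle (the current
method) and machdep.idle_available (assumed here to be a
comma-separated list); other_idle_method is a made-up helper name, not
an existing tool. The script only prints a suggested command and
changes nothing by itself.

```shell
#!/bin/sh
# Hypothetical helper: print the first method in the argument list
# that differs from the current one (first argument).
other_idle_method() {
    current=$1
    shift
    for method in "$@"; do
        if [ "$method" != "$current" ]; then
            printf '%s\n' "$method"
            return 0
        fi
    done
    # No alternative found.
    return 1
}

# Query the sysctls only where they exist; elsewhere do nothing.
if current=$(sysctl -n machdep.idle 2>/dev/null); then
    # Assumption: the list is comma-separated, so turn commas into spaces.
    available=$(sysctl -n machdep.idle_available | tr ',' ' ')
    # Word splitting of $available is intentional here.
    if next=$(other_idle_method "$current" $available); then
        echo "current idle method: $current"
        echo "to switch, run as root: sysctl machdep.idle=$next"
    fi
fi
```

If the hang no longer reproduces under a different idle method, that
would point towards the wakeup problem rather than a leaked spinlock.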

--Sz/4MAOlM1c8JZ8/
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iQIcBAEBAgAGBQJQ5HEbAAoJEJDCuSvBvK1BHqAP/3n+4ZOqUmrwGefE+uorAie+
2DM6KRS0V0aoGdPvVJRKQlKFRIZ59skFbaYuqrqXrX6by0q2eBXi6gCatZZ1jdP4
RX20zfK9awYnoxAHP4aNILPQrNN5Gfkgdia5tFnygCSJBuAWKDkaeW4yGpLFoDn9
t+ztc7WN7eCC7eXAVL1DGPIkhBvsmaQAjy6uiF4COruO6EopgopIPeQIBHfJgxIb
uFeYeGIs4iKCh8C7ErZp2AXI4STKAidwanrKgriq6nnO3oaccKtAV+f5xA6o2Zp1
10t3ikWujhN+6saTPRoDZ0ydDvcHKKOq1d5WLUsMdUcyn5E9rmz4Siwfp4R7MfoJ
DhsTHZ8riCkzNbj32lT+Yp3DfV3+bemqQav17xW/sS+Y3h+LQ9FP+Ko3ukxBJcqH
JFIqpsGFiltW1HWwTJzULGSvw7OJjtWIM9IN6dDfWAu/Hy5P+lOyfMvXuiZIizYQ
6uLm8xVBs8ayE8IJQaUR6BsD7Mk2r1/9pjAOvmfb+PExN3mf8jqW54eTt6PcYXAn
tsx2rVbR580J/TJbCEuFAlZARv/ohiQWIgAHSkcRgzsQbID+qVQI0y5Ce5n0ji72
VM6mLWdKX0h0nEmrCLw+kgiB6ZggRwTycZNCrxf7MPEGgUyQbvZPIE9M2Blol/dZ
XK5s6u/jYkNhsiJSgqNt
=n3YC
-----END PGP SIGNATURE-----

--Sz/4MAOlM1c8JZ8/--