From: Konstantin Belousov
To: Rick Macklem
Cc: alc, stable@freebsd.org
Date: Wed, 2 Jan 2013 19:40:44 +0200
Subject: Re: NFS-exported ZFS instability
Message-ID: <20130102174044.GB82219@kib.kiev.ua>
References: <20130102.105304.1817355190360003433.hrs@allbsd.org> <1914428061.1617223.1357133079421.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <1914428061.1617223.1357133079421.JavaMail.root@erie.cs.uoguelph.ca>
List-Id: Production branch of FreeBSD source code

On Wed, Jan 02, 2013 at 08:24:39AM -0500, Rick Macklem wrote:
> Hiroki Sato wrote:
> > Hello,
> >
> > I have been in a trouble about my NFS server for a long time. The
> > symptom is that it stops working in one or two weeks after a boot. I
> > could not track down the cause yet, but it is reproducible and only
> > occurred under a very high I/O load.
> >
> > It did not panic, just stopped working---while it responded to ping,
> > userland programs seemed not working. I could break it into DDB and
> > get a kernel dump. The following URLs are a log of ps, trace, and
> > etc.:
> >
> > http://people.allbsd.org/~hrs/FreeBSD/pool.log.20130102
> > http://people.allbsd.org/~hrs/FreeBSD/pool.dmesg.20130102
> >
> > Does anyone see how to debug this? I guess this is due to a deadlock
> > somewhere. I have suffered from this problem for almost two years.
> > The above log is from stable/9 as of Dec 19, but this have persisted
> > since 8.X.
> >
> Well, I took a quick glance at the log and there are a lot of processes
> sleeping on "pfault" (in vm_waitpfault() in sys/vm/vm_page.c). I'm no
> vm guy, so I'm not sure when/why that will happen. The comment on the
> function suggests they are waiting for free pages.
>
> Maybe something as simple as running out of swap space or a problem
> talking to the disk(s) that has the swap partition(s) or ???
> (I'm talking through my hat here, because I'm not conversant with
> the vm side of things.)
>
> I might take a closer look this evening and see if I can spot anything
> in the log, rick
> ps: I hope Alan and Kostik don't mind being added to the cc list.

What I see in the log is a lock cascade rooted in thread 100838, which
owns the system map mutex.
I believe this prevents malloc(9) from making progress in other threads,
which, for example, own the ZFS vnode locks. As a result, the whole
system wedged.

Looking back at thread 100838, we can see that it is executing
smp_tlb_shootdown(). It is impossible to tell from a static dump whether
the appearance of smp_tlb_shootdown() in the backtrace is transient, or
whether the thread is spinning there, waiting for the other CPUs to
acknowledge the request. But since the system wedged, smp_tlb_shootdown()
is most likely spinning.

Under this hypothesis, the most likely cause is some other core running
with interrupts disabled. Inspection of the backtraces of the processes
running on all cores does not show any that could legitimately own a
spinlock or otherwise run with interrupts disabled.

One thing you could try is to enable WITNESS for spinlocks, to try to
catch a leaked spinlock. I very much doubt that this is the case, though.

Another thing to try is to switch the CPU idle method to something else.
Look at the machdep.idle* sysctls. Could it be some CPU erratum that
blocks wakeup by an interrupt under some conditions in C1?
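
For the idle-method experiment, a minimal sketch of how one might list
the candidates to try. It assumes the machdep.idle and
machdep.idle_available sysctls are present on the affected machine and
that the latter is a comma-separated list; the helper names and the
sample value in the comment are hypothetical and not taken from the
posted logs.

```python
import subprocess


def parse_idle_methods(available: str) -> list[str]:
    """Split a comma-separated machdep.idle_available value into names."""
    return [m.strip() for m in available.split(",") if m.strip()]


def alternatives(available: str, current: str) -> list[str]:
    """Return the idle methods other than the currently selected one."""
    return [m for m in parse_idle_methods(available) if m != current.strip()]


def read_sysctl(name: str) -> str:
    """Read one sysctl value; only meaningful on the FreeBSD host itself."""
    result = subprocess.run(
        ["sysctl", "-n", name], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Assumed value shape, for illustration only: "spin, mwait, hlt, acpi"
    available = read_sysctl("machdep.idle_available")
    current = read_sysctl("machdep.idle")
    print("current idle method:", current)
    for method in alternatives(available, current):
        # Changing the method needs root; this only prints the command.
        print("candidate: sysctl machdep.idle=" + method)
```

The script only prints the commands rather than applying them, so each
method can be tried one at a time under the same I/O load that triggers
the hang.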