From: Konstantin Belousov
To: Rick Macklem
Cc: Jeremy Chadwick, Michael Landin Hostbaek, freebsd-stable@FreeBSD.org, John Baldwin, Andriy Gapon
Subject: Re: Core Dump / panic sleeping thread
Date: Wed, 20 Mar 2013 11:49:54 +0200
Message-ID: <20130320094954.GV3794@kib.kiev.ua>

On Tue, Mar 19, 2013 at 07:37:43PM -0400, Rick Macklem wrote:
> Andriy Gapon wrote:
> > on 19/03/2013 19:35 Jeremy Chadwick said the following:
> > > On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek
> > > wrote:
> > [snip]
> > >> Unread portion of the kernel message buffer:
> > >> Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
> > >> KDB: stack backtrace of thread 100256:
> > >> #0 0xffffffff808f2d46 at mi_switch+0x186
> > >> #1 0xffffffff8092bb52 at sleepq_wait+0x42
> > >> #2 0xffffffff808f34d6 at _sleep+0x376
> > >> #3 0xffffffff80b4f3ae at vm_object_page_remove+0x2ce
> > >> #4 0xffffffff80b5ac7d at vnode_pager_setsize+0x17d
> > >> #5 0xffffffff8082102c at nfscl_loadattrcache+0x2cc
> > >> #6 0xffffffff80818d37 at nfs_getattr+0x287
> > >> #7 0xffffffff8098f1c0 at vn_stat+0xb0
> > >> #8 0xffffffff809869d9 at kern_statat_vnhook+0xf9
> > >> #9 0xffffffff80986b55 at kern_statat+0x15
> > >> #10 0xffffffff80986c1a at sys_lstat+0x2a
> > >> #11 0xffffffff80bd7ae6 at amd64_syscall+0x546
> > >> #12 0xffffffff80bc3447 at Xfast_syscall+0xf7
> > >> panic: sleeping thread
> > >> cpuid = 0
> > >> KDB: stack backtrace:
> > >> #0 0xffffffff809208a6 at kdb_backtrace+0x66
> > >> #1 0xffffffff808ea8be at panic+0x1ce
> > >> #2 0xffffffff8092ed22 at propagate_priority+0x1d2
> > >> #3 0xffffffff8092fa4e at turnstile_wait+0x1be
> > >> #4 0xffffffff808d8d48 at _mtx_lock_sleep+0xd8
> > >> #5 0xffffffff80820fa4 at nfscl_loadattrcache+0x244
> > >> #6 0xffffffff8081758c at ncl_readrpc+0xac
> > >> #7 0xffffffff80824c45 at ncl_getpages+0x485
> > >> #8 0xffffffff80b5aa0c at vnode_pager_getpages+0x9c
> > >> #9 0xffffffff80b3fc93 at vm_fault_hold+0x673
> > >> #10 0xffffffff80b41cc3 at vm_fault+0x73
> > >> #11 0xffffffff80bd84b4 at trap_pfault+0x124
> > >> #12 0xffffffff80bd8c6c at trap+0x49c
> > >> #13 0xffffffff80bc315f at calltrap+0x8
> > [snip]
> >
> > I think that the regular mutex which is acquired via NFSLOCKNODE() in
> > nfscl_loadattrcache() can not be held across vnode_pager_setsize.
> > I am not sure though when vap->va_size != np->n_size case is
> > triggered.
> >
> Yep, I'd agree to that. The same bug is in the old NFS client and
> the new NFS client cribbed the code from there.
>
> I have attached a simple patch that unlocks the mutex for the
> vnode_pager_setsize() call. Maybe you could test it?
>
> Thanks for reporting this, rick
> ps: Hopefully "patch" can apply this patch (there have been
>     recent changes to this file, so the line#s could be off).
>     It should be easy to do manually if not. The change is
>     in nfscl_loadattrcache() in sys/fs/nfsclient/nfs_clport.c.
>
> > > You're going to need to provide the following details:
> > >
> > > 1. Contents of /etc/rc.conf
> > > 2. Contents of /etc/sysctl.conf (if modified)
> > > 3. Contents of /etc/fstab
> > > 4. ifconfig -a
> > > 5. OS used by the NFS server, and all configuration details
> > >    pertaining to that system
> > >
> > > You may also be asked to upgrade to 9.1-STABLE, as there may be
> > > fixes for whatever this is in base/stable/9 that are not in
> > > -RELEASE, but this is speculative on my part.
> > >
> > I do not see a need for any of these.
> >
> > --
> > Andriy Gapon

> --- fs/nfsclient/nfs_clport.c.savit 2013-03-19 18:37:33.000000000 -0400
> +++ fs/nfsclient/nfs_clport.c 2013-03-19 18:44:21.000000000 -0400
> @@ -444,7 +444,9 @@ nfscl_loadattrcache(struct vnode **vpp,
>                                 np->n_size = vap->va_size;
>                                 np->n_flag |= NSIZECHANGED;
>                         }
> +                       NFSUNLOCKNODE(np);
>                         vnode_pager_setsize(vp, np->n_size);
> +                       NFSLOCKNODE(np);
>                 } else {
>                         np->n_size = vap->va_size;
>                 }

I do not like it. As I said in the previous response to Andrey,
I think that moving the vnode_pager_setsize() call after the unlock
is better, since it reduces races with other threads seeing a
half-done attribute update or making an attribute change
simultaneously.
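
To make the suggested ordering concrete, here is a minimal user-space
sketch in C. It is not the FreeBSD code and not a proposed patch: struct
node, node_lock(), node_unlock(), pager_setsize() and load_attr() are
hypothetical stand-ins for the nfsnode, NFSLOCKNODE(), NFSUNLOCKNODE(),
vnode_pager_setsize() and nfscl_loadattrcache(), and a pthread mutex
plays the role of the non-sleepable kernel mutex. The idea shown is to
finish the whole attribute update under the lock, remember that the size
changed, and make the possibly sleeping call only after the final unlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the NFS node; not the real struct nfsnode. */
struct node {
	pthread_mutex_t lock;	/* plays the role of the NFSLOCKNODE() mutex */
	uint64_t n_size;	/* cached file size, protected by lock */
	uint64_t pager_size;	/* size last handed to the stand-in pager */
};

/* Per-thread count of held node locks, used only for self-checking. */
static _Thread_local int locks_held;

static void
node_lock(struct node *np)
{
	pthread_mutex_lock(&np->lock);
	locks_held++;
}

static void
node_unlock(struct node *np)
{
	locks_held--;
	pthread_mutex_unlock(&np->lock);
}

/*
 * Stand-in for vnode_pager_setsize(): assumed to be allowed to sleep,
 * so the calling thread must not hold any node lock here.
 */
static void
pager_setsize(struct node *np, uint64_t size)
{
	assert(locks_held == 0);
	np->pager_size = size;
}

/*
 * Update the cached size under the lock, then notify the pager after
 * the unlock. Returns true if the size changed.
 */
static bool
load_attr(struct node *np, uint64_t new_size)
{
	bool setsize = false;
	uint64_t size = 0;

	node_lock(np);
	if (new_size != np->n_size) {
		np->n_size = new_size;
		size = new_size;	/* snapshot taken under the lock */
		setsize = true;
	}
	/* Any remaining attribute updates would go here, still locked. */
	node_unlock(np);

	if (setsize)
		pager_setsize(np, size);	/* lock is no longer held */
	return (setsize);
}
```

Compared with dropping and retaking the lock around the call, this
keeps the attribute update in one critical section, so no other thread
can observe it half done. The cost is that the pager is told about the
new size slightly later, using the value captured under the lock.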