Date: Tue, 14 May 2013 19:31:49 +0300
From: Konstantin Belousov
To: Roger Pau Monné
Subject: Re: FreeBSD-HEAD gets stuck on vnode operations
Message-ID: <20130514163149.GS3047@kib.kiev.ua>
In-Reply-To: <5192618D.8070501@citrix.com>
Cc: "current@freebsd.org"

On Tue, May 14, 2013 at 06:08:45PM +0200, Roger Pau Monné wrote:
> On 13/05/13 17:00, Konstantin Belousov wrote:
> > On Mon, May 13, 2013 at 04:33:04PM +0200, Roger Pau Monné wrote:
> >> On 13/05/13 13:18, Roger Pau Monné wrote:
>
> Thanks for taking a look,
>
> >> I would like to explain this a little bit more, the syncer process
> >> doesn't get blocked on the _mtx_trylock_flags_ call, it just continues
> >> looping forever in what seems to be an endless loop around
> >> mnt_vnode_next_active/ffs_sync. Also while in this state there is no
> >> noticeable disk activity, so I'm unsure of what is happening.
> > How many CPUs does your VM have ?
>
> 7 vCPUs, but I've also seen this issue with 4 and 16 vCPUs.
>
> >
> > The loop you describing means that other thread owns the vnode
> > interlock. Can you track what this thread does ? E.g.
> > look at the vp->v_interlock.mtx_lock, which is basically a pointer to
> > the struct thread owning the mutex, clear low bits as needed. Then you
> > can inspect the thread and get a backtrace.
>
> There are no other threads running, only syncer is running on CPU 1 (see
> ps in previous email). All other CPUs are idle, and as seen from the ps
> quite a lot of threads are blocked in vnode related operations, either
> "*Name Cac", "*vnode_fr" or "*vnode in". I've also attached the output
> of alllocks in the previous email.
This is not useful. You need to look at the mutex which fails the
trylock operation in mnt_vnode_next_active(), see who owns it, and then
'unwind' the locking dependencies from there. I described the procedure
above.

>
> >
> > Does the loop you described stuck on the same vnode during the whole
> > lock-step time, or is the progress made, possibly slowly ?
>
> I'm not sure how to measure "progress", but indeed the syncer process is
> not locked, it is iterating over mnt_vnode_next_active.
Progress means that the iteration moves from vnode to vnode, instead of
looping over the same vnode continuously. I did read what you said about
the system becoming un-stuck after some time, but I am asking whether the
iterator changes during the stuck time.

>
> >
> > I suppose that your HEAD is recent.
>
> Last commit in my local repository is:
>
> Date: Tue, 7 May 2013 12:39:14 +0000
> Subject: [PATCH] By request, add an arrow from NetBSD-0.8 to FreeBSD-1.0.
>
> While here, add a few more NetBSD versions to the tree itself.
>
> Submitted by: Alan Barrett
> Submitted by: Thomas Klausner
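
The owner lookup described in the reply (take vp->v_interlock.mtx_lock and
clear the low bits to obtain the owning struct thread pointer) can be
sketched in plain C. This is a minimal userland illustration, not kernel
code and not necessarily how the kernel does it: the SKETCH_* constants and
sketch_* helpers are hypothetical names, the flag values are an assumption
modeled on sys/mutex.h and should be checked against the kernel being
debugged, and the lock words in the self-check are made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * ASSUMPTION: the low bits of the mutex lock word carry state flags and
 * the remaining bits are the address of the owning struct thread.
 * The values below are illustrative; verify them in your sys/mutex.h.
 */
#define SKETCH_MTX_RECURSED  ((uintptr_t)0x1)
#define SKETCH_MTX_CONTESTED ((uintptr_t)0x2)
#define SKETCH_MTX_UNOWNED   ((uintptr_t)0x4)
#define SKETCH_MTX_FLAGMASK \
	(SKETCH_MTX_RECURSED | SKETCH_MTX_CONTESTED | SKETCH_MTX_UNOWNED)

/*
 * Clear the flag bits of a raw lock word. The result is the candidate
 * struct thread address, or 0 when only flag bits were set (no owner).
 */
static uintptr_t
sketch_mtx_owner(uintptr_t mtx_lock)
{
	return (mtx_lock & ~SKETCH_MTX_FLAGMASK);
}

/* Self-check with made-up lock words (not real kernel addresses). */
static void
sketch_self_check(void)
{
	/* Contested lock word: the flag bit is stripped, the address stays. */
	assert(sketch_mtx_owner((uintptr_t)0xc1234562u) ==
	    (uintptr_t)0xc1234560u);
	/* A word holding only the "unowned" flag yields no owner. */
	assert(sketch_mtx_owner(SKETCH_MTX_UNOWNED) == 0);
}
```

The masked value is what one would then treat as a struct thread pointer in
the debugger to inspect the owner and obtain its backtrace, which is the
starting point for unwinding the lock dependencies.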